UKOUG / LOSUG Jan 2015

Or, Down the Rabbit Hole.

I checked the archives and the last event I went to was in May 2012. Everything was exactly as it was then, although my extended absence only heightened my sense of how surreal the whole event is.

I don’t mean that in a bad way at all. Quite the opposite. Peter Tribble gave an interesting talk entitled “Adventures with illumos”. The other middle-aged geeks and I were treated to another journey down memory lane, where I heard words long absent from my ears like “Sun”, “Solaris” and “CDE”.

The irony is that a lot of the technology is new and, as Peter ably demonstrated, is actively being worked on. What made it surreal is how far divorced it all is from the “real”, i.e. commercial, world, especially if you have been exclusively in that environment for a few years, dealing with big vendors, clients and “The Cloud”.

There’s something very “British” and heartening about how hobbyists (I can’t think of a better description) dedicate their time to preserving and developing something that is of value because they (we) love the technology: the effort and creativity embedded in it, and the respect for the people who have worked on it before. People are not doing this to become the next dot-com with A, B and C round funding. They’re doing it because they love it.

As time goes by and the rise of the machines continues, this community could form the core of a whole new field. In the compressed timescales of IT, the rate of change means that the loss of knowledge is a real risk. Britain is very good at preserving heritage; in IT terms, Bletchley Park is the obvious example that springs to mind, and you can find sites like http://www.ourcomputerheritage.org/. These, however, are mainly concerned with hardware; hardware that can be identified with a specific country.

Code is different. It is global. It is like DNA, literally “code”. It contains lots of redundancy and continually evolves. How do we preserve it? Should we preserve it? Is anyone else writing this story down? Who is carrying on the work of Peter Salus and his history of Unix http://www.amazon.com/Quarter-Century-UNIX-Peter-Salus/dp/0201547775?

Great to have a dose of un-commerciality once in a while.

UKOUG / LOSUG May 2012

Late, but as promised, a quick blog on Dr. Clive King’s presentation to the OpenSolaris User Group in September.

Another great talk which will particularly stick in my mind for the anecdote about the customer who bought larger and larger systems to solve performance problems and ultimately found out that RAM was being artificially limited by a system setting. Ouch. That came from Clive’s tip #1: don’t copy /etc/system settings!

I noted down 10 tips in total and got links to other good blogs, such as Gerry Haskins’ blog.

By the time the room got hot and stuffy, we were into the realms of chip performance, large pages and MMU traps, so we had covered a wide spectrum.

If I do any Solaris performance tuning in the future, this talk will certainly be my starting point.

LOSUG – UKOUG September – and other things

LinkedIn helpfully tells everyone it is 99 days since I wrote a blog post. Thanks LinkedIn. A few non-IT projects have been in progress (and summer holidays as well).

On the IT side, I have recently managed to download ESXi 5.0 and install it easily on a USB stick. I’ve imported guests from my datastore and got the appliance-based virtual center running. All very easy, I am happy to report!

Touching on IT, I have built a VM and installed Magento e-commerce web shopping software for a related project.

For a client, I have written a few simple load-generating scripts in Perl to help test a virtual environment. It’s not as easy as you might think to generate memory load. Just allocating a large chunk of RAM doesn’t work, as the host operating system (Linux in this case) notices the memory is not being accessed and pages it out over time. You can see it happening in a nice graph in virtualcenter. ESXi will probably try to do something clever too, even if the O/S doesn’t. To keep RAM in use, you need to access it continuously, which I did with random accesses into an array.
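The random-access idea can be sketched as follows. This is not the original Perl script, just a minimal Python illustration of the same approach; the function name `hold_memory`, its parameters and the 4 KiB page size are assumptions made for the example.

```python
import random
import time


def hold_memory(size_mb: int, duration_s: float, touches_per_round: int = 10_000) -> int:
    """Allocate size_mb of RAM and keep writing to random offsets for duration_s seconds.

    Returns the number of random writes performed.
    """
    if size_mb <= 0:
        raise ValueError("size_mb must be positive")
    buf = bytearray(size_mb * 1024 * 1024)  # zero-filled allocation
    page = 4096  # assumed page size; adjust for the platform under test
    # Write to every page once so the OS actually backs the allocation.
    for offset in range(0, len(buf), page):
        buf[offset] = 1
    touched = 0
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        # Random writes keep the whole buffer "hot", so there is no cold
        # region the kernel can quietly page out.
        for _ in range(touches_per_round):
            buf[random.randrange(len(buf))] = 1
            touched += 1
    return touched


if __name__ == "__main__":
    # Small demonstration values; a real load test would use far more memory.
    print(hold_memory(size_mb=8, duration_s=1.0))
```

Writing rather than reading is a deliberate choice here: dirty pages are harder for the kernel to discard than clean ones. Whether that matters depends on the hypervisor and guest settings, so treat it as a starting point.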

I made it along to the September OpenSolaris User Group meeting, where Nick Todd gave a talk on the Solaris linker and Alasdair Lumsden gave an update on the OpenIndiana project (which I am downloading now).

Nick’s talk was entitled “The Missing Link”. Apt, as we all tend to take that step for granted, but there’s a decent amount of engineering in there. It was interesting to note that even in the days of card decks, you nearly always had to “bracket” your deck with pre- and post-instructions to tell the machine what to do with it. That concept lives on in the ELF file format, where executable code is bracketed by crt1 and crtn code.

There are two main aspects of linking, the link editor and the runtime linker, plus a set of Solaris commands to aid development and debugging, not least of which is “elfdump”.

This talk was fascinating, not least because it reminded us that all this goes on, and it contained plenty of tips and places for further reading. (I will insert links when I get them.)

Alasdair gave an update on OpenIndiana (1 year old!) and the upcoming 151a stable release. OpenIndiana is based on Solaris 11 Express and Illumos, but future releases will fork from Oracle and enable innovation and new features. KVM has been added and GCC will be used as the compiler. The combination of these various technologies (KVM, QEMU, Illumos, ZFS, Crossbow, Zones, DTrace) is a potent mix.

Not least is the consideration that the source code is freely available: if you are serious about security, there is no substitute for examining and compiling the code yourself, particularly given the recent hacks against the Linux kernel.

LOSUG February – DTrace

Finally, I have a chance to write up the interesting introduction to DTrace given to a well-attended group by Jim Mauro. This man loves to talk about DTrace, a fact which came across quickly on the night (and he also told us the same)!

I’ve hardly used DTrace myself, having spent the latter part of my Unix sysadmin career mainly on Linux platforms. All the text below is straight from my notes of Jim’s talk. Inaccuracies will be all mine.

First of all: the shameless plug. Yes, The DTrace Cookbook will be available soon – 1200 pages of DTrace tips and recipes. See www.dtracebook.com (actually I can’t find this, but I turned up some YouTube videos with Jim talking about the book).

Jim wanted the main take-away from the talk to be that DTrace is complicated, but by necessity, as it was designed to look at complex systems. It is like an MRI scan: the output is complex but, like an MRI scanner operator, you don’t need much experience to run it. Interpreting the output is where experience counts.

Within the DTrace toolkit, which is all open source, you get DTrace plus Perl or shell scripts. The three main DTrace components are Probes, Providers and Consumers.

Probes insert code dynamically into unmodified running code by altering its image in memory. Typically this is done at the entry or exit point of functions.

A Provider is a library of probes, used to manage them under sensible names, e.g. IO. In Jim’s experience 50% of problems are due to IO. There is a lot of code written to do disk IO and a key question is often who is starting the IO.

A probe is specified by four fields (provider:module:function:name), and a blank field matches everything in that position. (Unfortunately I didn’t get the examples down. Incidentally, the slides should be available from the LOSUG website http://opensolaris.org/jive/forum.jspa?forumID=64).
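DTrace names each probe by four colon-separated fields, provider:module:function:name, and a blank field in a specification acts as a wildcard. The matching rule can be sketched in Python as below. This is only an illustration of the idea, not how DTrace itself is implemented; the helper names are made up, the probe names are illustrative, and the sketch insists on all four fields being written out, which is stricter than DTrace.

```python
from fnmatch import fnmatchcase


def parse_probe(spec: str) -> tuple:
    """Split 'provider:module:function:name' into its four fields."""
    parts = spec.split(":")
    if len(parts) != 4:
        raise ValueError(f"expected 4 colon-separated fields, got {len(parts)}")
    return tuple(parts)


def matches(pattern: str, probe: str) -> bool:
    """Return True if every field of the pattern is blank or matches the probe."""
    pattern_fields = parse_probe(pattern)
    probe_fields = parse_probe(probe)
    return all(
        p == "" or fnmatchcase(f, p)  # blank = match anything; '*' globs also work
        for p, f in zip(pattern_fields, probe_fields)
    )


if __name__ == "__main__":
    # Illustrative probe name; not taken from the talk.
    probe = "io:genunix:bdev_strategy:start"
    print(matches("io:::start", probe))
    print(matches("io:::done", probe))
```

So a specification such as io:::start selects the start probe of the io provider in every module and function.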

Consumers are the commands: dtrace, lockstat, plockstat and intrstat.

DTrace User Components: these comprise predicates and actions. Traditional performance analysis involves gathering a lot of data, then spending 80% of the effort pruning the data down and 20% of the time looking at the resulting good data. Predicates in DTrace do this pruning for you.

A D program: syscall is a very useful provider, e.g. to collect some data at the entry point of all syscalls. Use a D script when the command line gets complex. DTrace has aggregating functions and variables. “@” indicates an aggregating variable; these are akin to associative arrays – the index is a DTrace variable.
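For readers who have not met aggregations, the classic D clause `syscall:::entry { @[execname] = count(); }` keeps a running count per process name. A rough Python analogy is a counter keyed by whichever field you choose. The event list below is invented for illustration; in DTrace the data comes from probe firings and is aggregated in the kernel, not from a list in memory.

```python
from collections import Counter

# Hypothetical stream of (process name, syscall name) events.
events = [
    ("sshd", "read"),
    ("sshd", "write"),
    ("cron", "read"),
    ("sshd", "read"),
]


def count_by(events, key_index):
    """Count events grouped by one field, like @[key] = count() in D."""
    agg = Counter()
    for event in events:
        agg[event[key_index]] += 1
    return agg


by_process = count_by(events, 0)  # analogous to @[execname] = count()
by_syscall = count_by(events, 1)  # analogous to @[probefunc] = count()

if __name__ == "__main__":
    for name, total in by_process.most_common():
        print(name, total)
```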

Getting Started: DTrace was created to debug production systems. Previously the right tools were not available. You had to core dump a running system! DTrace is safe and the probe effect is minimal. It has a built-in watchdog which turns DTrace off if it detects problems. DTrace is not necessarily the first tool to use.

Performance Metrics: How Fast (throughput) / How Long (latency) / How Many (IOPS) / How Much (utilisation).

DTrace – Getting the Big Picture: After the “stat” tools, use the “big” providers.

Getting Started One-Liners: looking at CPU: the profile provider gives time-based data collection. Use an odd number for the sampling rate (because housekeeping is done every 10ms). The tick probe can exit a script after a period. Even if you can’t read stack traces, you can get useful hints from looking at them.

System Metrics – Example: sysinfo provider.

Memory One-Liners: vminfo.

DTrace can “connect the dots”.

Well, that is the end of my notes, which doesn’t seem like much for 90 mins of fast chat. In my defence, Jim is a difficult talker to make notes on and there were a lot of examples! I noticed a camera at the back of the room on my way out so perhaps a video of the event is available from LOSUG.

There is, of course, a wikipedia entry for DTrace which can be found at http://en.wikipedia.org/wiki/DTrace.

While DTrace is undoubtedly brilliant, its main drawback is of course that it is not available for more systems.

UKOUG/LOSUG November

Recently I’ve been very busy experimenting with vCloud Director in the office lab. It’s a complex beastie and many thanks to Duncan Epping for his excellent tutorials. When I get more experience, I may blog on it, but as I have a dozen things to do on my ever-growing list, it seems unlikely. One quick note to self though: the strange error I got about a host already being controlled was due to agent confusion, and I had to run vslad-uninstall.sh and re-install to fix it.

Now, on to the subject. Alasdair Lumsden introduced this month’s UKOUG/LOSUG speakers, Tom Kranz and Peter Tribble, with the usual suspects in the audience. Tom talked about “Exploring Solaris Auto Registration”: that part of Solaris in Update 9 which sends all your machine details to Oracle over the internet. Oh yes. In 99% of cases you don’t want it to do this, so he explained how to disable it and some of the commands involved: stclient, stlisten, stdiscover. All worth reading up on for the professional Solaris admin.

Peter Tribble’s talk was titled “Sar – past present and future” and, let’s face it, “past” is the key word here. No-one was arguing with that point of view, but Peter showed a good, simple idea to allow people to keep historical data and use it to assist in problem diagnosis and as a source of graphs for management. Keeping historical data was/is one of sar’s useful aspects. Peter simply keeps kstat -p output and stores it in compressed format for future use. The storage required is not that large and you can quite easily keep a couple of years’ worth. He has written some tools to scan the compressed logs and mimic the output of the stat commands, so you can see what “mpstat 10” would have produced at midnight on February 20th last year, if you want.
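The idea is simple enough to sketch. The Python below is not Peter’s tooling, just a minimal illustration under stated assumptions: each snapshot is the text of one “kstat -p” run (lines of module:instance:name:statistic, a tab, then the value), snapshots are stored gzip-compressed, and a “stat”-style figure is the difference in a counter between two snapshots. The function names, file names and sample values are all invented.

```python
import gzip
import os
import tempfile


def parse_kstat(text: str) -> dict:
    """Parse 'module:instance:name:statistic<TAB>value' lines into a dict."""
    stats = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition("\t")
        stats[key] = value.strip()
    return stats


def save_snapshot(path: str, text: str) -> None:
    """Store one kstat -p capture as a gzip-compressed text file."""
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        fh.write(text)


def load_snapshot(path: str) -> dict:
    """Read a compressed capture back into a dict of statistics."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return parse_kstat(fh.read())


def delta(older: dict, newer: dict, key: str) -> int:
    """Change in a numeric counter between two snapshots."""
    return int(newer[key]) - int(older[key])


if __name__ == "__main__":
    # Made-up sample values standing in for two real captures.
    first = "cpu:0:sys:cpu_ticks_idle\t1000\n"
    second = "cpu:0:sys:cpu_ticks_idle\t1900\n"
    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, "first.gz")
        b = os.path.join(tmp, "second.gz")
        save_snapshot(a, first)
        save_snapshot(b, second)
        print(delta(load_snapshot(a), load_snapshot(b), "cpu:0:sys:cpu_ticks_idle"))
```

Values are kept as strings on load because not every kstat statistic is numeric; only the counter being compared is converted.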

Perhaps we’ll get a talk on sar’s older brother, process accounting, one month 🙂 No, please, just joking.

Next month Alasdair is planning workshops, running from late afternoon, which could be interesting. In January, Jim Mauro is in the country promoting the new DTrace book and we can look forward to a talk from him!

LOSUG September

I attended the Oracle LOSUG meeting on September 15th to hear a talk from Phil Kirk on Zones and Crossbow.

I also took the opportunity to meet Alasdair Lumsden (who has set up OpenIndiana).

I scribbled down a few notes to help jog my memory.

HISTORY

  • Zones were never meant to be like VMs. They were designed as a process container.
  • Zones have a shared IP stack and routing.
  • There is (typically) a separate IP alias per zone.
  • IPMP works.
  • Config is done from the global zone.
  • IPfilter works (v4).
  • DHCP, IPsec, raw sockets don’t work.

Some problems with zones:

  • Non-global routing is affected by global routing table. (Some examples).
  • A null route is often used to add a gateway entry, but this is where global routing table changes can break zones.
  • Default routes are selected round-robin.
  • defrouter option in the zone config just does a route add (nothing clever in the kernel).
  • inter-zone traffic can be forced to go over the wire. Normally it would go via loopback for efficiency but some sites require audit/logging of traffic.

NOW

  • Each zone gets its own IP stack.
  • Config is done in the zone.
  • Lots of zones need lots of NICs.
  • Can mix shared and exclusive stacks.

CROSSBOW:

  • Virtualisation at the data (MAC address) level: vNICs.
  • A vNIC gives bandwidth resource management (dladm).
  • VLANs are supported in Crossbow.

P.S. What happened to my complimentary UKOUG membership?