It's pretty interesting to dive into the situation of recovering from unexpected reboots. Our usual lab setup consists of three parts:
Any of these could suffer unexpected power loss and subsequent power restore. The basic question is: what do we handle in the way of recovery?
For lots of things, it's necessary to maintain status. An example is the following: you are a scientist and use the above setup to set up and test your sensor. You leave the lab but then the PC unexpectedly reboots because a system administrator mistakenly remotely rebooted the PC.
When the EGSE software automatically starts again, should it attempt to initialize the biasing board? Probably not -- you may be running a test and the sensor settings should not be changed.
But then again, there is the situation of an expected power-up. You want to differentiate between the two, if you want your electronics to always be initialized upon normal startup.
Now there's complexity: both the EGSE and the Controller board will have to maintain state. Any discrepancies will have to be resolved between the two. In the end, it might be much simpler to just say that we do not support automatic initialization when the Controller board comes online.
Choices, choices...
When asking for some criticism from a colleague today, I got some additional pointers and ideas on my design sketch. Some concepts were implicit, but would be more clear when mentioned specifically:
The Controller board will carry state for the equipment that's in the rack, but since a (possibly accidental) power-down of the rack would lose state, the previously mentioned discovery mechanism still has to be created.
The software will also get a lot more simpler if we assume there is some intelligence in the rack. Thus, we can assume that a future rack will perform the following functions:
He also pointed out it's worth thinking about whether we must model the rack functions itself, perhaps as a class called RackDriver. The slots in the rack were previously left out of the model, because they did not have any meaning for the software. This now changes, since we assume the rack has some intelligence.
So far, the custom software we built at SRON assumed that users would manually configure the software when any hardware was replaced, and that there was only one electronics board present. The following is a preliminary design on the software changes required to support an array of boards that is placed in a standard format rack.
The basis is, that we are developing standard electronics boards for certain tasks. For instance, a standard design for an electronics card that contains a DAC and an ADC. These cards must work in array-like structures, for instance a standard 19 inch subrack which contains a number of these standard electronics boards. This could be used door biasing a sensor, reading out a sensor, housekeeping duties and PC communication duties. Such a rack would consist of a number of slots, each which consist of a rail that guides an inserted board to the backplane. This backplane provides power and connection to the software. The central requirement is: the user should be able to add and remove the boards without complex procedures. Any procedures like board-specific power-up/power-down sequences should be handled automatically.
The setup thus consists of two parts:
To support the above sketched requirements, we can recognize several use cases in this setup:
The user must be able to add or remove a board into the rack, and the software should detect this. Also, most boards must be initialized in some way. Thus there must be hooks that run a particular script when hardware changes. This also means that the hardware must actively identify itself, let the script take care of this, or give the software some uniform way of checking this. More on this later.
Replacing, or the moving of a board from one slot to another can be covered by a simple remove/add action.
Since the hardware and software can be powered on and off independently, both situations must be covered. Thus the software must have some sort of discovery mechanism when starting. The hardware must have some way of rate limiting if it actively advertises the adding or removing of a board. More on this later.
There are two possible ways in which a rack is powered down: expectedly and unexpectedly. The software does not need to be adapted either way. In the case of an expected power down, there should be a project-specific power down script. In the case of an unexpected power down, it should be determined whether the project needs a way of detecting this.
When the EGSE is powered up, it should see whether a rack is connected and if so, a discovery mechanism should see what boards are present. More on the discovery mechanism later. When the ESW is powered up, no particular actions are necessary.
There are two possible ways in which the EGSE is powered down: expectedly and unexpectedly. The software does not need to be adapted either way. In the case of an expected power down, there should be a project-specific power down script. In the case of an unexpected power down, it should be determined whether the project needs a way of detecting this.
The ESW can also be powered down, either accidental or as per request. There is no difference between the two, since the ESW functions as a pass-through and does not maintain state.
For the above use cases, the software obviously requires a constant and up to date register of all available boards plus their addresses. The following objects can be found in the use cases: rack, slot, board. A rack is divided in slots. A slot contains a board. Typically, racks can have shelves but for now, we assume that there's only one shelf. Also, racks are contained in a cabinet but again, there can be only one rack for now.
The current requirements do not necessitate that the software exactly knows which slots are occupied. Thus, this concept is currently not taken into account. That leaves us with the following classes:
There are two options for addressing. Currently, all boards have an address pre-programmed into the FPGA. This is fine in a situation where we can manually increment the unique address. The software will then simply use a discovery mechanism where a dummy request is sent to each possible address. When a reply is received, the board is then added to the present board list. Discovery must be quick since it inhibits other usage of the bus, and is done periodically. Thus the most logical place to run the discovery, is probably the ESW.
But when using multiple off-the-shelf boards, it is much easier to let the boards actively know that they were inserted, and let the software hand out addresses. The software still needs a discovery mechanism in case the software is brought down for some reason. This can be the same as previously mentioned.
In the first release:
For version two, we see the following points:
For version three, we see the following points:
We got a demo from the Coverity people. We ran their tool on our code base in advance. Via a WebEx session we got an explanation of the results, but first we got an overview of the company and their projects since some of the team were new to this stuff.
It's a pretty young company, founded less than ten years ago, and their aim is to deliver products that improve the quality of your software. Clients are in the medical and aerospace branche. Wikipedia article on Coverity. They have a 1000+ customers.
From the webbased Integrity Center software, several tools can be controlled. One of them is Static Analysis, called the Prevent tool. The tool identifies critical problems, not the more trivial things like style compliance etcetera.
Since bugs are cheaper to fix in development rather than in the field, this gives the user time and cost savings.
The software checks the compiler calls that are made when you do a build (via make) and then works on the code in the same way. It's not a replacement for unit tests. After running, a database of the results is written and there is a web frontend where you can read out the database.
The screen shows a number of defects, with filter options at the left. When clicking on a defect, you can see the code as well as th classification of the defect. Along with the classification, there is a short explanation of this type of issue. Clicking further will also give simple examples so you better understand the defect.
Each defect can be assigned to a certain team member. We have already invested in using Traq so I'm not so sure that's useful.
We had questions about finding concurrency problems. Coverity can help with this but they support pthread it of the box. Since we use QThreads, we should make a model for that library. However since we have the code available (Qt is open souce) and it's using PThreads, it's not a problem and Coverity will be able to pick it up automatically.
Besides the existing checks, it's possible to add your own checks. Perhaps you want to enforce a certain way in which you use an external library.
The software tries to be smart. For example sometimes you do some smart coding which usually triggers an error. Coverity will use heuristics and not report it if the rest of the code base shows that this is not something worth reporting.
We closed off the demo with a discussion on licensing. The account manager teams up with a technical consultant and together they pretty extensively work on the requirements and resulting cost savings. From that, the price is derived. There are other licensing models however.
If you're on Debian or Ubuntu Linux and you want to send a quick e-mail from a Perl script, use the Email::Send module (this module has been superceded by the Email::Sender module, but that one doesn't seem to be present in the Debian Stable package repository yet).
First, install the appropriate packages:
$ sudo apt-get install libemail-send-perl
Then use the following snippet:
use Email::Send;
my $message = <<'__MESSAGE__'; To: bartvk@example.com From: bartvk@example.com Subject: This is a mail from a Perl script
This is the body of an e-mail from a perlscript __MESSAGE__
my $sender = Email::Send->new({mailer => 'Sendmail'});
$sender->send($message);Now go and use this wisely, my young padawan. Some observations:
[[Weblog_entries_2009?]]
Articles, chronologically (latest first):
Not yet finished. Maybe will never be finished. Maybe they'll get deleted. Who knows?
Programming:
System administration:
Others:
Files: