Pergelator: Linux Think

Sunday, December 19, 2010

I've been working on porting some code for a microcontroller from a commercial compiler to gcc, which is a free one. I would hate to be in the commercial compiler outfit's shoes these days, but then I am 90% unemployed these days, so that would make them better off than me. Hmmm.

Anyway, when dealing with microcontrollers, you spend a lot of time writing code that interfaces directly with the hardware, often in terms of registers. On an 8-bit part like the AVR, a register is a set of eight bits, where each bit has some distinct purpose. To fulfill that purpose, you need to be able to turn those bits on and off without disturbing any of the other bits in the register. Typically this is done by reading the eight bits from the register, performing a bitwise AND or OR operation with a mask, and then writing the eight-bit value back into the register.

A mask is just a set of eight bits, each set a certain way, so that when you perform the logical operation, the bit you intend to change is indeed changed. For instance, suppose you want to turn on bit zero of Register_A. You would make a mask that was all zeros except for a single one. In binary it would look like 00000001. In decimal it would just be 1.

You would read Register_A, OR that value with your mask, and then write it back out to the register. For instance:

mask = 1;
temp = REGISTER_A;
temp = temp | mask;
REGISTER_A = temp;
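
Spelled out in C, the read-modify-write sequence might look like the sketch below. There is no real hardware here, so REGISTER_A is just an ordinary volatile variable standing in for a memory-mapped register; on a real chip it would come from the vendor's header. The function names set_bits and clear_bits are made up for illustration. Turning a bit off is the mirror image of turning it on: AND with the inverted mask instead of OR with the mask.

```c
#include <stdint.h>

/* Stand-in for a hardware register (assumption: on a real chip this
   would be a memory-mapped address defined in the vendor's header). */
volatile uint8_t REGISTER_A = 0;

/* Turn on the bits selected by mask, leaving the others alone. */
void set_bits(uint8_t mask)
{
    uint8_t temp = REGISTER_A;        /* read   */
    temp = (uint8_t)(temp | mask);    /* modify */
    REGISTER_A = temp;                /* write  */
}

/* Turn off the bits selected by mask: AND with the inverted mask. */
void clear_bits(uint8_t mask)
{
    uint8_t temp = REGISTER_A;                   /* read   */
    temp = (uint8_t)(temp & (uint8_t)~mask);     /* modify */
    REGISTER_A = temp;                           /* write  */
}
```

Calling set_bits(1) turns on bit zero, and clear_bits(1) turns it back off, without touching the other seven bits.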

You can make a mask for each of the eight bits in a byte, like this:

Bit 0 00000001
Bit 1 00000010
Bit 2 00000100
Bit 3 00001000
Bit 4 00010000
Bit 5 00100000
Bit 6 01000000
Bit 7 10000000

But all these zeros can make you blind, and if you get the wrong number of zeros to the right of the one, everything you do with it will be wrong. So people have come up with a number of clever ways to make masks. You could use decimal numbers:

Bit 0 1
Bit 1 2
Bit 2 4
Bit 3 8
Bit 4 16
Bit 5 32
Bit 6 64
Bit 7 128

But you can see where that might lead to confusion, since someone reading this would have to recognize that the decimal number corresponds to a power of two, and therefore has only one bit set. A slightly better way is to use hexadecimal values:

Bit 0 1
Bit 1 2
Bit 2 4
Bit 3 8
Bit 4 0x10
Bit 5 0x20
Bit 6 0x40
Bit 7 0x80

This tells the reader that you are at least thinking about this number as something other than just a quantity. You could use a leading 0x0 before the first four values if you want; it doesn't make any difference to the compiler.
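
If you would rather not trust your eyes on whether a value really is a single-bit mask, you can let the machine check. The sketch below is only an illustration: the array name bit_masks and the function has_single_bit are invented for the example. It relies on the usual trick that v & (v - 1) clears the lowest set bit, so the result is zero only when there was just one bit to begin with.

```c
#include <stdint.h>

/* The same eight masks written in hexadecimal. */
static const uint8_t bit_masks[8] = {
    0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80
};

/* Returns 1 if exactly one bit of v is set, i.e. v is a power of two.
   Clearing the lowest set bit with v & (v - 1) leaves zero only when
   there was a single bit to begin with. */
int has_single_bit(uint8_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}
```

Every entry in the table passes the check. A value like 48 (0x30) does not, because it has two bits set.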

A technique I ran across a few years ago when I first started working with the AVR is using the left shift operation to define these masks. For instance:

#define bit_0_mask (1 << 0)
#define bit_1_mask (1 << 1)
#define bit_2_mask (1 << 2)
#define bit_3_mask (1 << 3)
#define bit_4_mask (1 << 4)
#define bit_5_mask (1 << 5)
#define bit_6_mask (1 << 6)
#define bit_7_mask (1 << 7)

This has the advantage of taking equivalent numerical values out of the equation completely. It shows that you explicitly mean the bit in that position. Whatever numerical value it may have is irrelevant.
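
Here is a small sketch of the shift style in use. The two #define lines are copied from the list above; the helper functions bit_mask and bit_is_on are hypothetical names made up for this example, not something from any particular header.

```c
#include <stdint.h>

#define bit_3_mask (1 << 3)
#define bit_7_mask (1 << 7)

/* Hypothetical helper: mask for bit n (0..7), built with a shift. */
uint8_t bit_mask(unsigned n)
{
    return (uint8_t)(1u << n);
}

/* Hypothetical helper: nonzero if bit n of value is set. */
int bit_is_on(uint8_t value, unsigned n)
{
    return (value & bit_mask(n)) != 0;
}
```

The compiler works out the shifts for the #define versions at compile time, so bit_3_mask costs exactly the same as writing 0x08 by hand.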

Now we come to gcc. One of the things software people often try to do is to write their programs so they will run on any machine. One of the tricks they will use is to define something two different ways, and then use the something when they write their program. Now when you want to run the program on one machine, you use one definition of something. When you want to run on another machine, you use the other something.
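
A bare-bones sketch of that trick, using C's conditional compilation, is shown below. Everything specific in it is invented for the example: TARGET_BOARD_B, LED_BIT, the bit numbers, and led_on are all hypothetical. The symbol TARGET_BOARD_B would normally be supplied on the compiler command line (for instance with -DTARGET_BOARD_B).

```c
#include <stdint.h>

/* TARGET_BOARD_B is a hypothetical symbol; normally it would be set on
   the compiler command line, e.g. -DTARGET_BOARD_B. */
#if defined(TARGET_BOARD_B)
#define LED_BIT 5   /* made-up: LED wired to bit 5 on board B */
#else
#define LED_BIT 2   /* made-up: LED wired to bit 2 on the default board */
#endif

/* The rest of the program only ever mentions LED_BIT. */
uint8_t led_on(uint8_t port_value)
{
    return (uint8_t)(port_value | (1u << LED_BIT));
}
```

The function led_on never changes; only the definition of LED_BIT does, depending on which machine you are building for.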

Linux comes from the Unix school of programming, and Unix has been around since the age of dinosaurs, i.e. the big mainframes. Machines were so widely varied back then, and time was so expensive, that programmers developed a set of definitions that boggles the mind. That tradition continues, and today you cannot pick up a piece of Linux source code that is not absolutely riddled with conditional compilation controls.

But this time they went too far. Using the left shift operator was not good enough. They had to make a macro out of it. They called it _BV, which in the land of microcontrollers usually means Battery Voltage. Not in this case, and they won't tell you what the definition is. I'm sure the definition is in an include file somewhere, but I couldn't find it. I deduced what it must be from context. A rational definition would probably look something like this:

#define _BV(p) (1 << (p))

I imagine the gcc definition probably runs four pages and covers every machine ever made, including some that no longer exist. Okay, you Linux weenies. This is too much. Bit masks for use with hardware are MACHINE DEPENDENT. They will never work with another machine. EVER. So you don't need a friggin' macro for this, especially since you aren't going to tell anyone what it is.
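
If you do write a macro like this yourself, it is worth wrapping the parameter in its own parentheses. The sketch below compares the two spellings; BV_LOOSE and BV_TIGHT are names made up for the comparison and are not anybody's real definition. With a plain number as the argument they agree, but with an expression they can differ, because << binds tighter than |.

```c
/* Both names are made up for this comparison. */
#define BV_LOOSE(p) (1 << p)     /* parameter not parenthesized */
#define BV_TIGHT(p) (1 << (p))   /* parameter parenthesized     */

/* With a plain number the two agree. */
int loose_plain(void) { return BV_LOOSE(3); }      /* 1 << 3 */
int tight_plain(void) { return BV_TIGHT(3); }      /* 1 << 3 */

/* With an expression as the argument they can differ, because << binds
   tighter than |. */
int loose_expr(void) { return BV_LOOSE(2 | 1); }   /* (1 << 2) | 1 */
int tight_expr(void) { return BV_TIGHT(2 | 1); }   /* 1 << (2 | 1) */
```

The loose version quietly turns a request for bit 3 into the value 5, which is exactly the sort of thing that eats an afternoon.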

I originally wrote this two weeks ago, but then I fell in a hole and forgot about it. I just polished it up and posted it.

2 comments:

BadTux said...: Uhm, excuse me? I maintain a Linux driver and I assure you that a disk drive is a disk drive regardless of whether it's connected to a big endian ARM or a little endian Intel, and its bits are its bits in its SCSI mode pages regardless of what kind of hardware your host computer is on. Same deal with the TCP/IP stack if doing iSCSI. So yes, a macro that abstracts out testing or setting bit [n] is useful... otherwise I end up with explicit duplicated code to do this all over and if I didn't get my coffee that morning, I might code one of them wrong.

As for compiler writers, I know a few, and they're quite well employed thank you. They just end up working on gcc while employed by Intel or by one of the ARM vendors, that's all.

December 21, 2010 at 10:47:00 AM PST

Chuck Pergiel said...: After I got done fuming over your response, I thought about it for a while.

I still see no benefit in using the BV macro. Are you saying you are more likely to type BV(3) correctly than to type (1<<3) correctly?

I still haven't seen the definition of BV.

I am sure some compiler writers are still employed, but I imagine a company that sells compilers may not be doing very well.

December 27, 2010 at 8:19:00 PM PST