Plus, on top of the unaligned access, which is already a lot slower, you need an additional shift instruction (and sign extension) to get at your data. Against all that stands a saving of 3 bytes of memory, which is nothing (unless you have a million elements).
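If your struct has more members, you may even be able to put the currently unused space to use by arranging them cleverly. The compiler will not reorder members for you, but small ones (bool values, for example) declared right after the char[5] land in the holes that would otherwise be padding.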
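To make the "holes" concrete, here is a minimal sketch. The member names are made up for illustration; only the unsigned long and the char[5] come from the question. It assumes a typical ABI where unsigned long is 4 or 8 bytes with matching alignment and bool is 1 byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The struct as discussed: 3 bytes of tail padding on typical ABIs. */
struct plain {
    unsigned long value;
    char          name[5];
};

/* Same struct plus three hypothetical flags declared after name[5].
   They occupy the bytes that were padding before. */
struct with_flags {
    unsigned long value;
    char          name[5];
    bool          dirty;
    bool          locked;
    bool          visible;
};

/* Runtime sanity check: the flags should not have grown the struct.
   This holds on common ABIs, but the C standard does not guarantee it. */
void check_flags_are_free(void)
{
    assert(sizeof(struct with_flags) == sizeof(struct plain));
}
```

So on such a platform the three flags cost nothing, whereas packing the struct would have thrown that space away in exchange for slower access.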
Also, speaking of "arranging": you generally help the compiler a lot to produce good code if you group things of the same size and put the "larger" types (unsigned long int in this case) first and the "smaller" ones (like char[5]) second. That way the compiler can align everything properly without wasting a lot of space. In this particular case it makes no difference, but with more members it does. There are a few good tutorials out there on how to arrange structs efficiently, but I was unable to find one right now.
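A rough illustration of why the ordering matters once there are more members. Both structs below are hypothetical and hold exactly the same data; the sketch assumes uint32_t is 4-byte aligned, as it is on common platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small and large members interleaved: each char is followed by
   padding so that the next uint32_t (or the struct end) is aligned. */
struct scattered {
    char     a;
    uint32_t b;
    char     c;
    uint32_t d;
    char     e;
};

/* Same members, larger types first, small ones grouped at the end. */
struct grouped {
    uint32_t b;
    uint32_t d;
    char     a;
    char     c;
    char     e;
};

/* Number of bytes saved per element just by reordering the members. */
size_t bytes_saved_by_grouping(void)
{
    assert(sizeof(struct grouped) <= sizeof(struct scattered));
    return sizeof(struct scattered) - sizeof(struct grouped);
}
```

With 4-byte alignment for uint32_t the first struct comes out at 20 bytes and the second at 12, without any packing and without any unaligned access.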
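If you are writing a device driver and need the structure to match a hardware register with a specific layout, then of course you must use -fpack-struct and reproduce exactly that layout. Keep in mind that -fpack-struct applies to every struct in the translation unit, not just that one.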
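For the driver case, a sketch of what that might look like. It assumes GCC or Clang and uses the per-struct packed attribute rather than the global flag, so only this one type is affected. The register layout is invented for the example and does not describe any real device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block: 1-byte status, 4-byte data,
   2-byte control, laid out back to back with no gaps. */
struct __attribute__((packed)) dev_regs {
    uint8_t  status;
    uint32_t data;
    uint16_t control;
};

/* Fail the build if the layout ever drifts from what the
   (hypothetical) hardware expects. */
static_assert(sizeof(struct dev_regs) == 7, "dev_regs must be 7 bytes");
static_assert(offsetof(struct dev_regs, data) == 1, "data must sit at offset 1");
static_assert(offsetof(struct dev_regs, control) == 5, "control must sit at offset 5");
```

The compile-time checks are the important part: when the layout is dictated by hardware, you want the build to break rather than the device.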