I needed another way so I extended BinaryWriter (mostly to get it to serialize the types I needed it to) and now my 60 bytes take 68 bytes to serialize. In the process of writing this class, I looked through the disassembled BinaryWriter using JetBrains's dotPeek and found this little gem (Write7BitEncodedInt(int)) and decided it'd make a great dotPeek of the Week:
protected void Write7BitEncodedInt(int value) { uint num = (uint) value; while (num >= 128U) { this.Write((byte) (num  128U)); num >>= 7; } this.Write((byte) num); }This is how the BinaryWriter.Write(string) encodes the length of the string. Thus, if you have a string with fewer than 128 characters, it only takes one byte to encode the length. In my implementation, I used this to specify the length of collections too. In fact, when writing a 32 bit signed integer, you should be able to save space for all positive numbers less than 2 ^ 22 and break even (require four bytes) for 2 ^ 29. Encoding an Int32 that is greater than or equal to 2 ^29 will require five bytes. Thus, if your integers tend to be smaller than 536,870,912, you'll probably save space encoding this way. A similar function could be used for a long where all positive values less than 2 ^ 57 will result in at least breaking even.
Here's how it works:
 Convert the number to an unsigned int so you can do arithmetic on positive numbers (after all, you're just writing bits)

While the converted value is greater than or equal to 128 (i.e., 8 bits)
 Write low 7 bits and put a 1 in the high bit (to indicate to the decoder more bytes are coming)
 Shift the 7 bits you just wrote off the number
 When the loop finishes, there will be 7 or fewer bits to write so write them
protected internal int Read7BitEncodedInt() { // some names have been changed to protect the readability int returnValue = 0; int bitIndex = 0; while (bitIndex != 35) { byte currentByte = this.ReadByte(); returnValue = ((int) currentByte & (int) sbyte.MaxValue) << bitIndex; bitIndex += 7; if (((int) currentByte & 128) == 0) return returnValue; } throw new FormatException(Environment.GetResourceString("Format_Bad7BitInt32")); }Here's how this works:
 Set up an int to accumulate your data as you read them
 Set up a place to keep track of which 7bit block you're reading
 While your bit index is less than 35 (i.e., you've read no more than 5 bytes comprising 1 "more bytes" indicator and 7 data bits)
 Read a byte
 Take the byte you just read and logical conjuction (bitwise and) it with sbyte.MaxValue (127 or in binary 0111 1111)
 Left shift those 7 bits to the position in which they belong
 Use a logical disjunction (bitwise or) to write those seven bits into your accumulator int
 Add 7 to your bit index for the next 7 bits you read
 If the current byte you read does not have a 1 in its most significant bit (i.e, byte & 1000 0000 == 0)
 There are no more bytes to read so just return the current accumulator value
 If you get to this point, you've read a 6th of 5 bytes and it's time to let the user know there was a formatting problem
No comments:
Post a Comment