Bitcoin Source Code Study Notes (Part 2)

Chapter 2

This chapter describes how the Bitcoin client serializes data, following on from the transaction creation covered in the previous chapter.

All serialization functions of the Bitcoin client are implemented in serialize.h. The CDataStream class is the core data structure for serialization.

CDataStream

CDataStream stores serialized data in a character container and wraps that container in a stream interface. It uses 6 member variables to do this:

    class CDataStream
    {
    protected:
        typedef vector<char, secure_allocator<char> > vector_type;
        vector_type vch;
        unsigned int nReadPos;
        short state;
        short exceptmask;
    public:
        int nType;
        int nVersion;
        //......
    };

vch stores serialized data. It is a character container type with a custom memory allocator. The memory allocator will be called by the implementation of the container when it needs to allocate/release memory. The memory allocator will clear the data in the memory before releasing the memory to the operating system to prevent other processes on the local machine from accessing the data, thereby ensuring the security of data storage. The implementation of the memory allocator is not discussed here, and readers can find it in serialize.h.
nReadPos is the position in vch at which the next read starts.
state is an error indicator. This variable is used to indicate errors that may occur during serialization/deserialization.
exceptmask is the error mask. It is initialized to ios::badbit | ios::failbit and selects which error bits in state are reported as exceptions.
The value of nType is one of SER_NETWORK, SER_DISK, SER_GETHASH, SER_SKIPSIG, and SER_BLOCKHEADERONLY. It tells CDataStream which kind of serialization to perform. These five symbols are defined in an anonymous enum. Each has an int value (4 bytes) that is a power of 2.

    enum
    {
        // primary actions
        SER_NETWORK = (1 << 0),
        SER_DISK = (1 << 1),
        SER_GETHASH = (1 << 2),
        // modifiers
        SER_SKIPSIG = (1 << 16),
        SER_BLOCKHEADERONLY = (1 << 17),
    };
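
Because every value is a distinct power of 2, each symbol occupies its own bit, so a primary action and a modifier can share one int. The following is a minimal sketch of that idea, not code from serialize.h: the enum values are copied from the listing above, while the helpers HasFlag() and DiskHeaderOnly() are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// Values copied from the enum above.
enum
{
    // primary actions
    SER_NETWORK = (1 << 0),
    SER_DISK = (1 << 1),
    SER_GETHASH = (1 << 2),
    // modifiers
    SER_SKIPSIG = (1 << 16),
    SER_BLOCKHEADERONLY = (1 << 17),
};

// Hypothetical helper (not in serialize.h): true if `flag` is set in nType.
inline bool HasFlag(int nType, int flag)
{
    return (nType & flag) != 0;
}

// Hypothetical helper: combine one primary action with one modifier.
inline int DiskHeaderOnly()
{
    return SER_DISK | SER_BLOCKHEADERONLY;
}
```

With this layout, HasFlag(DiskHeaderOnly(), SER_DISK) is true while HasFlag(DiskHeaderOnly(), SER_NETWORK) is false, because the two bits never overlap.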

nVersion is the version number.
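
The idea behind the allocator used by vch (wipe memory before giving it back) can be sketched as follows. This is an illustrative allocator written under stated assumptions, not the secure_allocator from serialize.h: the names zeroing_allocator, secure_bytes, g_wiped_bytes and WipedAfterScope() are hypothetical, and the counter exists only so that the wiping can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical counter, added only so the wiping can be observed.
static std::size_t g_wiped_bytes = 0;

// Illustrative allocator: overwrite memory with zeros before releasing it.
template <typename T>
struct zeroing_allocator
{
    typedef T value_type;

    zeroing_allocator() {}
    template <typename U>
    zeroing_allocator(const zeroing_allocator<U>&) {}

    T* allocate(std::size_t n)
    {
        return std::allocator<T>().allocate(n);
    }

    void deallocate(T* p, std::size_t n)
    {
        if (p != nullptr)
        {
            // Write through a volatile pointer so the compiler is less
            // likely to drop the stores to memory that is about to be freed.
            volatile unsigned char* q = reinterpret_cast<volatile unsigned char*>(p);
            for (std::size_t i = 0; i < n * sizeof(T); ++i)
                q[i] = 0;
            g_wiped_bytes += n * sizeof(T);
        }
        std::allocator<T>().deallocate(p, n);
    }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U>
bool operator==(const zeroing_allocator<T>&, const zeroing_allocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const zeroing_allocator<T>&, const zeroing_allocator<U>&) { return false; }

// Same shape as vector_type in CDataStream, with the sketch allocator.
typedef std::vector<char, zeroing_allocator<char> > secure_bytes;

// Returns how many bytes were wiped when a 16-byte vector went out of scope.
inline std::size_t WipedAfterScope()
{
    g_wiped_bytes = 0;
    {
        secure_bytes v(16, 'x');
        assert(v.size() == 16);
    }   // v is destroyed here; deallocate() wipes its storage first
    return g_wiped_bytes;
}
```

The container never calls the wiping code directly. It only asks its allocator to release memory, which is why swapping the allocator type is enough to change how vch handles its storage.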

CDataStream::read() and CDataStream::write()

The member functions CDataStream::read() and CDataStream::write() are low-level functions used to perform serialization/deserialization of CDataStream objects.

    CDataStream& read(char* pch, int nSize)
    {
        // Read from the beginning of the buffer
        assert(nSize >= 0);
        unsigned int nReadPosNext = nReadPos + nSize;
        if (nReadPosNext >= vch.size())
        {
            if (nReadPosNext > vch.size())
            {
                setstate(ios::failbit, "CDataStream::read() : end of data");
                memset(pch, 0, nSize);
                nSize = vch.size() - nReadPos;
            }
            memcpy(pch, &vch[nReadPos], nSize);
            nReadPos = 0;
            vch.clear();
            return (*this);
        }
        memcpy(pch, &vch[nReadPos], nSize);
        nReadPos = nReadPosNext;
        return (*this);
    }

    CDataStream& write(const char* pch, int nSize)
    {
        // Write to the end of the buffer
        assert(nSize >= 0);
        vch.insert(vch.end(), pch, pch + nSize);
        return (*this);
    }

CDataStream::read() copies nSize characters from CDataStream to a memory space pointed to by char* pch. The following is its implementation process:

Calculate the end position of the data to be read from vch: unsigned int nReadPosNext = nReadPos + nSize.
If the end position is greater than the size of vch, there is not enough data to read. In this case, set the state to ios::failbit by calling setstate(), fill pch with zeros, and reduce nSize to the number of bytes that remain in vch.
If the end position reaches or passes the end of vch, copy the remaining bytes to pch, then clear vch and reset nReadPos to 0.
Otherwise, call memcpy(pch, &vch[nReadPos], nSize) to copy nSize characters starting at position nReadPos of vch into the pre-allocated memory pointed to by pch, and then advance nReadPos to the next starting position, nReadPosNext.

This implementation shows that 1) after a piece of data is read from the stream, it cannot be read again; 2) nReadPos is the read position of the first valid data.

CDataStream::write() is very simple. It appends nSize characters pointed to by pch to the end of vch.
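
The consume-once behavior of read() and the append-only behavior of write() can be tried out with a simplified stand-in. The sketch below is not CDataStream: MiniStream and ReadConsumesData() are hypothetical names, it uses a plain std::vector<char>, and it throws std::out_of_range on underflow instead of calling setstate().

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <vector>

// Simplified stand-in for CDataStream: a byte vector plus a read cursor.
class MiniStream
{
    std::vector<char> vch;
    std::size_t nReadPos;

public:
    MiniStream() : nReadPos(0) {}

    // Append nSize bytes to the end of the buffer.
    MiniStream& write(const char* pch, int nSize)
    {
        assert(nSize >= 0);
        vch.insert(vch.end(), pch, pch + nSize);
        return *this;
    }

    // Copy nSize bytes out of the front of the buffer and consume them.
    // Simplification: throw on underflow instead of calling setstate().
    MiniStream& read(char* pch, int nSize)
    {
        assert(nSize >= 0);
        std::size_t nReadPosNext = nReadPos + nSize;
        if (nReadPosNext > vch.size())
            throw std::out_of_range("MiniStream::read(): end of data");
        if (nSize > 0)
            std::memcpy(pch, &vch[nReadPos], nSize);
        nReadPos = nReadPosNext;
        if (nReadPos == vch.size())
        {
            // Everything has been consumed: release the buffer.
            vch.clear();
            nReadPos = 0;
        }
        return *this;
    }

    // Number of bytes still unread.
    std::size_t size() const { return vch.size() - nReadPos; }
};

// Returns true if data can be read back once, and only once.
inline bool ReadConsumesData()
{
    MiniStream s;
    s.write("abcd", 4);

    char out[4];
    s.read(out, 2);                 // consumes "ab"
    bool firstOk = std::memcmp(out, "ab", 2) == 0 && s.size() == 2;

    s.read(out, 2);                 // consumes "cd", buffer is now empty
    bool secondOk = std::memcmp(out, "cd", 2) == 0 && s.size() == 0;

    bool thirdFailed = false;
    try { s.read(out, 1); }         // nothing is left to read
    catch (const std::out_of_range&) { thirdFailed = true; }

    return firstOk && secondOk && thirdFailed;
}
```

ReadConsumesData() returns true: each read advances the cursor, and once the cursor reaches the end the buffer is cleared, so the same bytes cannot be read a second time.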

Macros READDATA() and WRITEDATA()

The functions CDataStream::read() and CDataStream::write() are used to serialize/deserialize primitive types (int, bool, unsigned long, etc.). To serialize these data types, pointers to them are converted to char*. Since the size of each type is known at compile time through sizeof, its bytes can be read from a CDataStream or written to its character buffer directly. Two helper macros wrap these calls.

    #define WRITEDATA(s, obj) s.write((char*)&(obj), sizeof(obj))
    #define READDATA(s, obj) s.read((char*)&(obj), sizeof(obj))

Here is an example of how to use these macros. The following function will serialize an unsigned long type.

 template<typename Stream> inline void Serialize(Stream& s, unsigned long a, int, int=0) { WRITEDATA(s, a); }

Substituting the definition of WRITEDATA(s, a) gives the expanded function:

 template<typename Stream> inline void Serialize(Stream& s, unsigned long a, int, int=0) { s.write((char*)&(a), sizeof(a)); }

This function accepts an unsigned long parameter a, gets its memory address, converts the pointer to char* and calls the function s.write().
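
A round trip through both macros can be sketched as follows. The two macros and the Serialize() overload are taken from the text above. Everything else is an assumption made for illustration: ByteStream is a hypothetical minimal stream, the Unserialize() overload is written by analogy with Serialize(), and RoundTrip() is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

#define WRITEDATA(s, obj) s.write((char*)&(obj), sizeof(obj))
#define READDATA(s, obj) s.read((char*)&(obj), sizeof(obj))

// Hypothetical minimal stream with just enough interface for the macros.
struct ByteStream
{
    std::vector<char> vch;
    std::size_t nReadPos = 0;

    ByteStream& write(const char* pch, std::size_t nSize)
    {
        vch.insert(vch.end(), pch, pch + nSize);
        return *this;
    }

    ByteStream& read(char* pch, std::size_t nSize)
    {
        assert(nReadPos + nSize <= vch.size());
        if (nSize > 0)
            std::memcpy(pch, &vch[nReadPos], nSize);
        nReadPos += nSize;
        return *this;
    }
};

// Serialize() as shown in the text; Unserialize() is written by analogy.
template<typename Stream> inline void Serialize(Stream& s, unsigned long a, int, int=0) { WRITEDATA(s, a); }
template<typename Stream> inline void Unserialize(Stream& s, unsigned long& a, int, int=0) { READDATA(s, a); }

// Hypothetical helper: write a value, read it back, return what was read.
inline unsigned long RoundTrip(unsigned long value)
{
    ByteStream s;
    Serialize(s, value, 0);
    unsigned long restored = 0;
    Unserialize(s, restored, 0);
    return restored;
}
```

Note that the macros copy the in-memory representation as it is, so the bytes are in host byte order and their count is sizeof(unsigned long) on the machine that runs the code.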

Operators << and >> in CDataStream

CDataStream overloads operators << and >> for serialization and deserialization.

    template<typename T>
    CDataStream& operator<<(const T& obj)
    {
        // Serialize to this stream
        ::Serialize(*this, obj, nType, nVersion);
        return (*this);
    }

    template<typename T>
    CDataStream& operator>>(T& obj)
    {
        // Unserialize from this stream
        ::Unserialize(*this, obj, nType, nVersion);
        return (*this);
    }

The header file serialize.h contains 14 overloads of these two global functions for 14 primitive types (signed and unsigned versions of char, short, int, long and long long, as well as char, float, double and bool) and 6 overloads for 6 composite types (string, vector, pair, map, set and CScript). So, for these types, you can simply use the following code to serialize/deserialize data:

    CDataStream ss(SER_GETHASH);
    ss << obj1 << obj2;   // Serialization
    ss >> obj3 >> obj4;   // Deserialization

If no overload matches the type of the second argument obj, the following generic global function template is called.

 template<typename Stream, typename T>
inline void Serialize(Stream& os, const T& a, long nType, int nVersion=VERSION)
{
    a.Serialize(os, (int)nType, nVersion);
}

For this generic version, type T must implement a member function with the signature T::Serialize(Stream, int, int), which is called via a.Serialize().
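
How operator<< chooses between a primitive overload and the generic fallback can be sketched in a few lines. The two Serialize() templates below follow the shapes shown in the text. Sink and Point are hypothetical types, and the default version value 0 replaces VERSION so the sketch stands alone. One assumption worth stating: the generic version takes nType as long, which appears to be what lets the int-based primitive overloads win whenever both are viable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Overload for one primitive type: write the raw bytes of an int.
template<typename Stream>
inline void Serialize(Stream& s, int a, int, int = 0)
{
    s.write((const char*)&a, sizeof(a));
}

// Generic fallback: forward to the member function T::Serialize().
// nType is long here, so the overload above is the better match for int.
template<typename Stream, typename T>
inline void Serialize(Stream& os, const T& a, long nType, int nVersion = 0)
{
    a.Serialize(os, (int)nType, nVersion);
}

// Hypothetical stream with an operator<< shaped like CDataStream's.
struct Sink
{
    std::vector<char> bytes;
    int nType = 0;
    int nVersion = 0;

    Sink& write(const char* pch, std::size_t n)
    {
        bytes.insert(bytes.end(), pch, pch + n);
        return *this;
    }

    template<typename T>
    Sink& operator<<(const T& obj)
    {
        ::Serialize(*this, obj, nType, nVersion);
        return *this;
    }
};

// Hypothetical user type that supplies its own member Serialize().
struct Point
{
    int x;
    int y;

    template<typename Stream>
    void Serialize(Stream& s, int nType, int nVersion) const
    {
        ::Serialize(s, x, nType, nVersion);
        ::Serialize(s, y, nType, nVersion);
    }
};
```

Writing an int with `sink << 7` goes straight to the primitive overload, while `sink << point` falls through to the generic template, which in turn calls Point::Serialize() and ends up in the primitive overload once per field.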

How to serialize a type

As described above, a generic type T needs to implement the following three member functions to be serializable.

    unsigned int GetSerializeSize(int nType=0, int nVersion=VERSION) const;
    void Serialize(Stream& s, int nType=0, int nVersion=VERSION) const;
    void Unserialize(Stream& s, int nType=0, int nVersion=VERSION);

These three functions will be called by their corresponding global functions with generic type T. These global functions are called by the overloaded operators << and >> in CDataStream.

A macro IMPLEMENT_SERIALIZE(statements) is used to define the implementation of these three functions for any type.

 #define IMPLEMENT_SERIALIZE(statements) \
    unsigned int GetSerializeSize(int nType=0, int nVersion=VERSION) const \
    {\
        CSerActionGetSerializeSize ser_action; \
        const bool fGetSize = true; \
        const bool fWrite = false; \
        const bool fRead = false; \
        unsigned int nSerSize = 0; \
        ser_streamplaceholder s; \
        s.nType = nType; \
        s.nVersion = nVersion; \
        {statements}\
        return nSerSize; \
    } \
    template<typename Stream>\
    void Serialize(Stream& s, int nType=0, int nVersion=VERSION) const \
    {\
        CSerActionSerialize ser_action; \
        const bool fGetSize = false; \
        const bool fWrite = true; \
        const bool fRead = false; \
        unsigned int nSerSize = 0; \
        {statements} \
    } \
    template<typename Stream>\
    void Unserialize(Stream& s, int nType=0, int nVersion=VERSION) \
    { \
        CSerActionUnserialize ser_action; \
        const bool fGetSize = false; \
        const bool fWrite = false; \
        const bool fRead = true; \
        unsigned int nSerSize = 0; \
        {statements} \
    }

The following example demonstrates how to use this macro.

    #include <iostream>
    #include "serialize.h"
    using namespace std;

    class AClass {
    public:
        AClass(int xin) : x(xin) {}
        int x;
        IMPLEMENT_SERIALIZE(READWRITE(this->x);)
    };

    int main() {
        CDataStream astream2;
        AClass aObj(200);   // An AClass object with x being 200
        cout << "aObj=" << aObj.x << endl;
        astream2 << aObj;   // Serialize aObj into the stream
        AClass a2(1);       // Another object with x being 1
        astream2 >> a2;     // Deserialize into a2, overwriting x
        cout << "a2=" << a2.x << endl;
        return 0;
    }

This program serializes/deserializes the AClass object. It will output the following result on the screen.

 aObj=200
a2=200

These three serialization/deserialization member functions of AClass can be implemented in one line of code:

IMPLEMENT_SERIALIZE(READWRITE(this->x);)

The macro READWRITE() is defined as follows:

 #define READWRITE(obj) (nSerSize += ::SerReadWrite(s, (obj), nType, nVersion, ser_action))

The expansion of this macro is placed in all three functions generated by IMPLEMENT_SERIALIZE(statements). The same statement therefore has to serve three purposes, one per function: 1) return the size of the serialized data; 2) serialize (write) data to the stream; 3) deserialize (read) data from the stream. Refer to the definition of these three functions in the macro IMPLEMENT_SERIALIZE(statements).

To understand how the macro READWRITE(obj) works, you first need to understand where nSerSize, s, nType, nVersion and ser_action come from in its full form. They all come from the three function bodies of the macro IMPLEMENT_SERIALIZE(statements):

nSerSize is an unsigned int, initialized to 0 in all three functions;
ser_action is an object declared in all three functions, but with a different type in each: CSerActionGetSerializeSize, CSerActionSerialize and CSerActionUnserialize respectively;
s is defined as ser_streamplaceholder type in the first function. It is the first parameter passed to the other two functions and has parameter type Stream;
nType and nVersion are input parameters in all three functions.

So, once READWRITE() is expanded inside IMPLEMENT_SERIALIZE(), all of its symbols resolve, because they already exist in the function bodies generated by IMPLEMENT_SERIALIZE(). The expansion of READWRITE(obj) calls the global function ::SerReadWrite(s, (obj), nType, nVersion, ser_action). Here are all three versions of this function.

    template<typename Stream, typename T>
    inline unsigned int SerReadWrite(Stream& s, const T& obj, int nType, int nVersion, CSerActionGetSerializeSize ser_action)
    {
        return ::GetSerializeSize(obj, nType, nVersion);
    }

    template<typename Stream, typename T>
    inline unsigned int SerReadWrite(Stream& s, const T& obj, int nType, int nVersion, CSerActionSerialize ser_action)
    {
        ::Serialize(s, obj, nType, nVersion);
        return 0;
    }

    template<typename Stream, typename T>
    inline unsigned int SerReadWrite(Stream& s, T& obj, int nType, int nVersion, CSerActionUnserialize ser_action)
    {
        ::Unserialize(s, obj, nType, nVersion);
        return 0;
    }

As you can see, the function ::SerReadWrite() is overloaded into three versions. Depending on the type of the last parameter, it calls the global function ::GetSerializeSize(), ::Serialize() or ::Unserialize() respectively; these are the global functions introduced in the previous sections.

If you check the last parameter of the three different versions of ::SerReadWrite(), you will find that they are all empty types. The only purpose of these three types is to distinguish the three versions of ::SerReadWrite(), which are then used by all functions defined by the macro IMPLEMENT_SERIALIZE().
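
The whole mechanism (one statement list, three generated functions, empty types selecting the overload) can be reproduced in miniature. The sketch below is a reduced imitation under stated assumptions, not the code from serialize.h: every name in it (ActionGetSize, ActionWrite, ActionRead, Buffer, ReadWrite, MINI_READWRITE, MINI_IMPLEMENT_SERIALIZE, Pair, RoundTripPair) is hypothetical, it handles only int fields, and it drops nType and nVersion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Empty tag types: they carry no data and exist only to select an overload.
struct ActionGetSize {};
struct ActionWrite {};
struct ActionRead {};

// Hypothetical byte buffer standing in for CDataStream.
struct Buffer
{
    std::vector<char> vch;
    std::size_t nReadPos = 0;
};

// Three overloads that differ only in the tag type of the last parameter.
inline unsigned int ReadWrite(Buffer&, const int& obj, ActionGetSize)
{
    return sizeof(obj);                       // only report the size
}
inline unsigned int ReadWrite(Buffer& s, const int& obj, ActionWrite)
{
    const char* p = reinterpret_cast<const char*>(&obj);
    s.vch.insert(s.vch.end(), p, p + sizeof(obj));
    return 0;
}
inline unsigned int ReadWrite(Buffer& s, int& obj, ActionRead)
{
    assert(s.nReadPos + sizeof(obj) <= s.vch.size());
    std::memcpy(&obj, &s.vch[s.nReadPos], sizeof(obj));
    s.nReadPos += sizeof(obj);
    return 0;
}

// The names s, nSerSize and ser_action come from the enclosing function body.
#define MINI_READWRITE(obj) (nSerSize += ReadWrite(s, (obj), ser_action))

#define MINI_IMPLEMENT_SERIALIZE(statements) \
    unsigned int GetSerializeSize() const \
    { \
        ActionGetSize ser_action; \
        unsigned int nSerSize = 0; \
        Buffer s; \
        {statements} \
        return nSerSize; \
    } \
    void Serialize(Buffer& s) const \
    { \
        ActionWrite ser_action; \
        unsigned int nSerSize = 0; \
        {statements} \
        (void)nSerSize; \
    } \
    void Unserialize(Buffer& s) \
    { \
        ActionRead ser_action; \
        unsigned int nSerSize = 0; \
        {statements} \
        (void)nSerSize; \
    }

// One statement list produces all three member functions.
struct Pair
{
    int a;
    int b;
    MINI_IMPLEMENT_SERIALIZE(
        MINI_READWRITE(a);
        MINI_READWRITE(b);
    )
};

// Serialize a Pair, read it back, and compare against the reported size.
inline bool RoundTripPair()
{
    Pair in = {3, 4};
    Buffer buf;
    in.Serialize(buf);
    Pair out = {0, 0};
    out.Unserialize(buf);
    return out.a == 3 && out.b == 4 && in.GetSerializeSize() == buf.vch.size();
}
```

The tag objects are never inspected at run time. The compiler picks the overload from the static type of ser_action, so each generated function contains only the branch it needs.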
