WWW.DUMAIS.IO

C++ JSON LIBRARY

2012-07-09

After spending some time trying to find a good JSON library for C++, I found that all the libraries I came across felt too heavy. Some of them look very good, but they are cumbersome to use. So I decided to write my own. My library is compliant with RFC 4627, except that it supports neither Unicode nor numbers in exponential format.

Seriously, this library is really easy to use and has no dependencies (other than the STL). I could not find another C++ JSON library that is this simple to use.

Usage examples

The library exposes a single object, the "JSON" object, that is used to do everything you need. There is no need to include a whole bunch of header files or to juggle a whole bunch of classes. The object exposes these methods:

Method | Description
JSON& operator[](int i); | Access a list item
JSON& operator[](std::string str); | Access an object member
std::string str(); | Get value of item
void parse(std::string json); | Parse a JSON document
std::string stringify(bool formatted=false); | Serialize JSON object
JSON& addObject(const std::string& name=""); | Add object
JSON& addList(const std::string& name=""); | Add list
JSON& addValue(const std::string& val, const std::string& name=""); | Add string value
JSON& addValue(int val, const std::string& name=""); | Add integer value
JSON& addValue(double val, const std::string& name=""); | Add double precision FP value

Reading a JSON document

The library is very simple to use. Compiling it produces a "test" executable and a static library (.a) that you can link against. Now let's say you have the following JSON document:

{
    "obj1":{
        "member1":[
            "val5",
            "val4"
        ],
        "member2":"val3"
    },
    "list1":[
        "listItem1",
        "listeItem2",
        {
            "listObject1":"val2"
        }
    ],
    "value1":"val1"
}

The following code is an example of how to use the library:

    JSON json;
    std::string val;
    std::string str = someFunctionThatReadsAJSONDocumentFromFileOrNetworkOrWhatever();
    json.parse(str);
    val = json["obj1"]["member1"][0].str();       // would give "val5"
    val = json["list1"][2]["listObject1"].str();  // would give "val2" (the object is the third item of list1)

Each access to a member returns a JSON object, so there is only one class to deal with at all times. You can store each intermediate result in a new variable, or you can reach any member by chaining the calls.

    val = json["obj1"]["member1"][0].str();  // would give "val5"

    JSON& j1 = json["obj1"];
    JSON& j2 = j1["member1"];
    val = j2[0].str();  // would give "val5"
    val = j2.str();     // would give "{list}"
    val = j1.str();     // would give "{object}"

Invalid paths

The nice thing about this is that you don't need to worry about null objects. If you try to access an invalid member, you will get an invalid JSON object. But if you try to access another member from an invalid JSON object, you will also get an invalid JSON object. You will never get a NULL object that could crash your application.

    val = json["obj1"][1].str();                 // would give "{invalid}" because obj1 is an object, not a list
    val = json["obj2"].str();                    // would give "{invalid}"
    val = json["obj2"]["member2"].str();         // would also give "{invalid}"
    val = json["list"][100].str();               // would give "{invalid}"
    val = json["list"][100]["something"].str();  // would also give "{invalid}"
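
To illustrate why chained lookups can never hand back a NULL, here is a stripped-down sketch of the idea. It is not the library's actual source: Node, makeList, makeObject, append and set are names invented for this example, and the real implementation may work differently. The point is only that every failed lookup returns a reference to one shared "invalid" node, and that looking anything up on that node returns it again.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical miniature of the "never return NULL" idea (not the library's code).
class Node {
public:
    enum class Kind { Invalid, Value, List, Object };

    Node() = default;  // a default-constructed node is invalid
    explicit Node(const std::string& v) : kind_(Kind::Value), value_(v) {}
    static Node makeList()   { Node n; n.kind_ = Kind::List;   return n; }
    static Node makeObject() { Node n; n.kind_ = Kind::Object; return n; }

    // Builders used to set up a tree; they are ignored on the wrong node kind.
    Node& append(const Node& child) {
        if (kind_ != Kind::List) return invalid();
        children_.push_back(child);
        return children_.back();
    }
    Node& set(const std::string& key, const Node& child) {
        if (kind_ != Kind::Object) return invalid();
        keys_.push_back(key);
        children_.push_back(child);
        return children_.back();
    }

    // Index lookup: a non-list or an out-of-range index yields the invalid node.
    Node& operator[](int i) {
        if (kind_ != Kind::List || i < 0 ||
            static_cast<std::size_t>(i) >= children_.size()) return invalid();
        return children_[static_cast<std::size_t>(i)];
    }
    // Key lookup: a non-object or an unknown key yields the invalid node.
    Node& operator[](const std::string& key) {
        if (kind_ != Kind::Object) return invalid();
        for (std::size_t i = 0; i < keys_.size(); ++i)
            if (keys_[i] == key) return children_[i];
        return invalid();
    }

    std::string str() const {
        switch (kind_) {
            case Kind::Value:  return value_;
            case Kind::List:   return "{list}";
            case Kind::Object: return "{object}";
            default:           return "{invalid}";
        }
    }

private:
    // One shared invalid node; it is never modified, so sharing it is safe.
    static Node& invalid() { static Node n; return n; }

    Kind kind_ = Kind::Invalid;
    std::string value_;
    std::vector<std::string> keys_;  // keys_[i] names children_[i] (objects only)
    std::vector<Node> children_;
};

// Small usage example for the sketch.
inline void nodeExample() {
    Node root = Node::makeObject();
    Node& list = root.set("list1", Node::makeList());
    list.append(Node("item1"));
    assert(root["list1"][0].str() == "item1");
    assert(root["list1"][100]["something"].str() == "{invalid}");
    assert(root["obj2"]["member2"].str() == "{invalid}");
}
```

Because operator[] always returns a reference to a real object, the caller never has to test a pointer before chaining the next lookup.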

Writing a JSON document

There are 3 methods provided to add items in the JSON document:

  • addObject(const std::string& keyName="")
  • addList(const std::string& keyName="")
  • addValue(const std::string& val, const std::string& keyName="")

All 3 functions take an optional keyName parameter, because when you add an item to an Object you need to specify the key name under which the parent will store it. Again, I wanted a simple interface that does not force the programmer to use different classes for lists and objects. Here is how those function calls behave depending on whether you provide the key name:

Action | Result
addXXX on Object and provide key name | item added and parent uses keyName
addXXX on Object and don't provide key name | item added and a key name is auto generated
addXXX on List and provide key name | item added and key name ignored
addXXX on List and don't provide key name | item added
addXXX on Value and provide key name | operation is ignored
addXXX on Value and don't provide key name | operation is ignored

After adding items to the JSON object, you can then serialize it with stringify().

    JSON json;
    json.addObject("obj1");
    json.addList("list1");
    json.addValue("val1");  // key autogenerated because key name not provided
    json["list1"].addValue("item1");
    json["list1"].addValue("item2");
    std::string val = json.stringify(true);

Would output:

{
    "obj1":{
    },
    "list1":[
        "item1",
        "item2"
    ],
    "key2":"val1"
}

Download

Project can be found on github

IMPLEMENTING YOUR OWN MUTEX WITH CMPXCHG

2012-06-28

The cmpxchg instruction takes the form "cmpxchg destination, source", where the destination is a memory location and the source is a register. Before using this instruction, you need to load a value in the EAX register. The instruction first compares the value in EAX to the value in memory pointed to by the destination operand. If both values are equal, the value of the source operand is stored in memory where the destination operand points, and the zero flag (ZF) is set. Note that this compare-and-store operation is done atomically (on multiprocessor systems this requires the lock prefix). If, on the other hand, the destination and EAX do not match, the destination is loaded into EAX and ZF is cleared. At first, it might not be clear why this instruction would be useful. But consider this:

    l2: mov eax,[mutex]
        cmp eax,1
        je l2
        mov eax,1
    l3: mov [mutex],eax

This is an unsafe way of creating a mutex. You loop until its value is zero and then set a 1 in it. But what if another thread or another CPU changed the value between l2 and l3?

If you need to store the value of a lock in memory (let's say at location 0x12345678), then before attempting to lock a section of code, you would read the lock to see if it is free. So you would read location 0x12345678 and test whether the value is zero. If it isn't, you keep reading memory until it reads as zero (because some other thread cleared it). After that, you would need to store a "1" in this location to take ownership of the lock. But what if another thread takes ownership between the time you read the value and the time you write it? The CMPXCHG instruction will write a "1" in there only if a "0" was in memory first. EAX is equal to "0" because we first spin until the memory value is "0". So after that, we tell the CPU: "EAX is zero now, so compare the value at 0x12345678 with EAX (thus 0) and change it to 1 if it is equal. Otherwise, if the value at 0x12345678 is not equal to 0 anymore, load this value into EAX and I will go back to spinning until I get a zero". Simple enough? Here is some sample code that illustrates this.

        mov edx,1
    l2: mov eax,[mutex]
        cmp eax,1
        je l2                     ; spin until we see that eax == 0
        lock cmpxchg [mutex],edx  ; At this point, eax=0 for sure. If the memory location is still
                                  ; equal to eax, then store edx in there.
                                  ; Otherwise, eax will be loaded with the changed value of mutex (should be 1).
        jnz l2                    ; If the mutex was modified, cmpxchg has loaded its value into eax,
                                  ; and that value wasn't equal to zero, by the definition of
                                  ; the CMPXCHG instruction. zf will have been cleared in that case,
                                  ; so we can just make a conditional jump back to the spin loop.

Now, notice how we used "lock" before using cmpxchg? This is because we want the CPU to lock the bus before doing the operation so that no other CPU will interfere with that memory location.
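
For comparison, the same spin-then-compare-and-swap loop can be sketched in C++ with std::atomic. This is only an illustration, not part of the original code: SpinLock and countWithLock are names made up here. On x86, compare_exchange_strong is typically compiled to a lock cmpxchg, though that is up to the compiler.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Sketch of a spinlock built on compare-and-swap (hypothetical class name).
class SpinLock {
public:
    void lock() {
        for (;;) {
            // Spin with plain loads until the lock looks free (the "l2" loop).
            while (state_.load(std::memory_order_relaxed) != 0) {
                std::this_thread::yield();
            }
            int expected = 0;
            // Atomically store 1 only if the value is still 0.
            if (state_.compare_exchange_strong(expected, 1,
                                               std::memory_order_acquire,
                                               std::memory_order_relaxed)) {
                return;  // we own the lock
            }
            // Another thread won the race; `expected` now holds the value it
            // stored, just like EAX after a failed cmpxchg. Go back to spinning.
        }
    }

    void unlock() { state_.store(0, std::memory_order_release); }

private:
    std::atomic<int> state_{0};  // 0 = free, 1 = taken
};

// Starts `threads` workers that each increment a shared counter `iterations`
// times under the lock, and returns the final counter value.
inline long countWithLock(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&lock, &counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;  // protected by the lock
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

If the lock works, countWithLock(4, 10000) returns exactly 40000. With the unsafe read-then-write version, some increments could be lost.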

WAKEUPCALL SERVER USING RESIPROCATE

2012-06-14

This is the first project I did with the resiprocate SIP stack. There are a lot of things left to do in this project, but I wanted to post the code here right away in case someone needs more examples of how to use resiprocate.

Dependencies and limitations

I chose to use resiprocate as the SIP stack, ortp as the RTP stack and libxml2 as the XML parser. The application only supports G.711 uLaw, and it only supports SIP INFO for receiving DTMF (inband and RFC2833 are not supported).

Resiprocate

Resiprocate provides a Dialog Usage Manager (DUM). This engine is very useful for applications that don't want to deal with low-level SIP messages. The DUM lets you receive events such as onOffered, onAnswer and onTerminated (plus many more) through an observer pattern. Using a class called AppDialogSet, it is possible to represent a "call" or a "dialog" and let the DUM manage it. For example, you could override the AppDialogSetFactory with your own CallFactory that creates "Call" objects derived from AppDialogSet. When you receive an event such as onOffered, the DUM will already have created an AppDialogSet with your factory class, and you can then cast this AppDialogSet to your "Call". This is a good way to get a "Call" reference on every event you receive. And the beauty of this is that you never need to delete it because the DUM will take care of it. More information is available on the resiprocate website.

ortp

ortp is very easy to use but only provides basic functionality. It won't bind to any sound card or include encoders like other, fancier stacks do. This stack only allows you to open a stream and feed it data encoded with whatever codec you want. It is the developer's responsibility to make sure that the data being fed is encoded with the proper codec.

Threading model

I chose to use 1 thread for general processing and 1 thread for each RTP session. The main thread is used to give cycles to the resiprocate DUM and to the WakeupCallService. A new thread is created for each RTP session. The RTP session only handles the outgoing stream since we don't need the incoming stream. The ortp stack provides a way to read multiple streams from the same thread, but I prefer to use different threads in order to leverage multi-core CPUs.

Usage

The server is a user agent that registers with your PBX. Just call the server and enter the time at which you want your wakeup call and the extension at which you want to be notified. For example, you would enter 0,6,3,0 to get a wakeup call at 6h30 AM. I left the prompts out of the package, so you will need to provide your own. The IVR is defined in the XML file; just change the prompt names. There is no configuration file right now, so you will need to set the values you need in config.h. To launch the application, run it and provide, as a command line argument, the IP address to bind to on your computer.
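
The 0,6,3,0 entry above could be turned into a time of day along these lines. This is only a sketch: WakeupTime and parseWakeupDigits are hypothetical names that are not taken from the project, and it assumes the DTMF digits have been collected into a 4-character string in 24-hour HHMM form.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type for this sketch.
struct WakeupTime {
    int hour;
    int minute;
};

// Converts four collected DTMF digits (e.g. "0630") into a time of day.
// Returns false, leaving `out` untouched, if the input is not exactly four
// digits or does not describe a valid 24-hour time.
inline bool parseWakeupDigits(const std::string& digits, WakeupTime& out) {
    if (digits.size() != 4) return false;
    for (char c : digits) {
        if (c < '0' || c > '9') return false;  // '*' and '#' are rejected too
    }
    const int hour = (digits[0] - '0') * 10 + (digits[1] - '0');
    const int minute = (digits[2] - '0') * 10 + (digits[3] - '0');
    if (hour > 23 || minute > 59) return false;
    out.hour = hour;
    out.minute = minute;
    return true;
}

// Usage: "0630" is accepted as 6:30, "2460" is rejected.
inline void parseWakeupDigitsExample() {
    WakeupTime t{};
    assert(parseWakeupDigits("0630", t) && t.hour == 6 && t.minute == 30);
    assert(!parseWakeupDigits("2460", t));
}
```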

Download

Download the source code

FFT ON AMD64

2012-06-05

Fast Fourier Transform with x86-64 assembly language

This is an old application I wrote a while ago, in 2005, when I got my first 64-bit CPU (AMD). The first thing I did after installing my new CPU was to open vi and start coding an FFT using 64-bit registers. This is old news, but 64-bit at that time was awesome. Not only can you store 64 bits in a register, but you get 16 general purpose registers (plus 16 XMM registers)!

The only really annoying thing with this architecture is that it doesn't provide a bit reversal instruction. I don't understand why a simple RISC processor like the AVR32 (look up "brev") has one but a high-end CISC like Intel or AMD does not. I don't actually show the bit reversal part of the FFT here though.

By the way, I remember running some tests with this algorithm and, although I don't remember the exact results (that was 7 years ago), I recall that it ran at least 5 times faster than most of the FFTs from other libraries that I compared it against.
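
Since the bit reversal step is left out, here is a minimal C++ sketch of what such a pass usually looks like. It is not the code that was used with this FFT: reverseBits and bitReversePermute are names made up for the example, and it assumes a power-of-two number of samples.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Reverses the lowest `bits` bits of `value` (e.g. 001 -> 100 for bits = 3).
inline std::size_t reverseBits(std::size_t value, unsigned bits) {
    std::size_t result = 0;
    for (unsigned i = 0; i < bits; ++i) {
        result = (result << 1) | (value & 1u);
        value >>= 1;
    }
    return result;
}

// Reorders `data` in place so that element i ends up at the index obtained
// by reversing the bits of i. The size is assumed to be a power of two.
inline void bitReversePermute(std::vector<float>& data) {
    assert((data.size() & (data.size() - 1)) == 0);  // power of two (or empty)
    unsigned bits = 0;
    while ((std::size_t{1} << bits) < data.size()) ++bits;  // bits = log2(size)
    for (std::size_t i = 0; i < data.size(); ++i) {
        const std::size_t j = reverseBits(i, bits);
        if (i < j) std::swap(data[i], data[j]);  // swap each pair only once
    }
}
```

For 8 samples numbered 0 to 7, the resulting order is 0, 4, 2, 6, 1, 5, 3, 7. The inner loop of reverseBits is the part that a single "brev"-style instruction would replace.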

//; x8664realfft(float* source,float** spectrum,long size)
x8664realifft:
    mov $1,%eax
    cvtsi2ss %eax,%xmm10
    pshufd $0b00000000,%xmm10,%xmm10
    mov $-1,%eax
    cvtsi2ss %eax,%xmm10
    pshufd $0b11000100,%xmm10,%xmm10
    jmp fftentry

x8664realfft:
    mov $1,%eax
    cvtsi2ss %eax,%xmm10
    pshufd $0b00000000,%xmm10,%xmm10

fftentry:
    pushq %rbp
    movq %rsp,%rbp
    pushq %rbp
    subq $0xFF,%rsp
    movq %rsp,%rbp
    //; make a 16bytes aligned buffer
    addq $16,%rbp
    andq $0xFFFFFFFFFFFFFFF0,%rbp

    pushq %r15
    pushq %r14
    pushq %r13
    pushq %r12
    pushq %r11
    pushq %r10
    pushq %r9
    pushq %r8

    //; rcx = size
    movq %rdx,%rcx
    pushq %rcx
    //; rdx = source
    mov %rdi,%rdx
    pushq %rdx
    //; rdi = spectrum[0]
    movq (%rsi), %rdi
    addq $8, %rsi
    //; rsi = spectrum[1]
    movq (%rsi), %rsi

    //; r8 = log2(N), r14= N
    pushq %rcx
    fld1
    fild (%rsp)
    xorq %r8,%r8
    pushq %r8
    fyl2x
    fistp (%rsp)
    popq %r8
    popq %r14

    //; bit reversal has already been done prior to calling this function

    //; r9 = nLargeSpectrum
    //; r10 = nPointsLargeSpectrum
    movq %r14,%r9
    movq $1,%r10
    movq $1,%r11

    mov %rdi,%r14
    mov %rsi,%r15

    //;load 2PI in st(0)
    fldpi
    fldpi
    faddp %st(0),%st(1)

    movq %r8,%rcx
l1:
    pushq %rcx
    shrq $1,%r9
    shlq $1,%r10

    //;st(0) = theta, st(1) = 2pi
    fld %st(0)
    pushq %r10
    fidiv (%rsp)
    popq %r10

    //;xmm0 = 2*costheta[0],2*costheta[0],2*costheta[0],2*costheta[0]
    //; st(0) = theta, st(1) = 2pi
    pushq %rax
    fld %st(0)
    fcos
    fstp (%rsp)
    movss (%rsp),%xmm0
    pshufd $0b00000000,%xmm0,%xmm0
    popq %rax
    addps %xmm0,%xmm0

    movq %r9,%rcx
l2:
    pushq %rcx

    //; r12 = point1 (index *4bytes) r13 = point2 (index *4bytes)
    movq %r10,%r12
    movq %r9,%rax
    subq %rcx,%rax
    pushq %rdx
    mulq %r12
    popq %rdx
    movq %rax,%r12
    movq %r11,%r13
    addq %r12,%r13
    shlq $2,%r13
    shlq $2,%r12

    //; xmm2 = costheta[2],sintheta[2],costheta[1],sintheta[1]
    movq %r12,16(%rbp)
    decq 16(%rbp)
    fld %st(0)
    fimul 16(%rbp)
    fsincos
    fstp (%rbp)
    fstp 4(%rbp)
    decq 16(%rbp)
    fld %st(0)
    fimul 16(%rbp)
    fsincos
    fstp 8(%rbp)
    fstp 12(%rbp)
    movaps (%rbp),%xmm2
    pshufd $0b10110001 ,%xmm2,%xmm2

    //;xmm1 = costheta[1],sintheta[1],0,0
    movhlps %xmm2,%xmm1

    movq %r11,%rcx
l3:
    //; recurrence formula
    //; xmm3 = w.re,w.im,w.re,w.im
    movaps %xmm2,%xmm3
    mulps %xmm0,%xmm3
    subps %xmm1,%xmm3
    movlhps %xmm3,%xmm3
    movaps %xmm2,%xmm1
    movaps %xmm3,%xmm2
    mulps %xmm10,%xmm3

    //; xmm5 := c.im,c.re,c.re,c.im
    movq %r14,%rdi
    movq %r15,%rsi
    addq %r13,%rdi
    addq %r13,%rsi
    movss (%rdi),%xmm5
    pshufd $0b00000011,%xmm5,%xmm5
    addss (%rsi),%xmm5
    pshufd $0b00101000,%xmm5,%xmm5

    //; xmm3 := inner product: re,re,im,im
    mulps %xmm3,%xmm5
    pshufd $0b11011101 ,%xmm5,%xmm3
    pshufd $0b10001000 ,%xmm5,%xmm5
    addsubps %xmm5,%xmm3
    pshufd $0b10101111,%xmm3,%xmm3

    //;xmm6 := sortedArray[point1].re,sortedArray[point1].re,sortedArray[point1].im,sortedArray[point1].im
    movq %r14,%rdi
    movq %r15,%rsi
    addq %r12,%rdi
    addq %r12,%rsi
    movss (%rdi),%xmm6
    pshufd $0b00001111,%xmm6,%xmm6
    addss (%rsi),%xmm6
    pshufd $0b11100000,%xmm6,%xmm6

    addsubps %xmm3,%xmm6
    pshufd $0b00100111,%xmm6,%xmm6
    movss %xmm6,(%rdi)
    pshufd $0b11100001,%xmm6,%xmm6
    movss %xmm6,(%rsi)

    movq %r14,%rdi
    movq %r15,%rsi
    addq %r13,%rdi
    addq %r13,%rsi
    pshufd $0b01001110,%xmm6,%xmm6
    movss %xmm6,(%rdi)
    pshufd $0b11100001,%xmm6,%xmm6
    movss %xmm6,(%rsi)

    //; increase point1 and point2 by 4 bytes (each index represent a float)
    addq $4,%r12
    addq $4,%r13

    decq %rcx
    jnz l3

    popq %rcx
    decq %rcx
    jnz l2

    //; remove theta from fpu stack
    fstp %st(0)
    shlq $1,%r11
    popq %rcx
    decq %rcx
    jnz l1

    popq %rdx
    //; rcx is already pushed in stack
    cvtsi2ss (%rsp),%xmm1
    pshufd $0b00000000,%xmm1,%xmm1
    popq %rcx
    shrq $2,%rcx
    movq %r14,%rdi
    movq %r15,%rsi

    //; is this a ifft or a fft?
    cvtss2si %xmm10,%eax
    cmp $-1,%eax
    jne nrm

cp:
    movaps (%rdi),%xmm2
    movntdq %xmm2,(%rdx)
    addq $16,%rdi
    addq $16,%rdx
    loop cp
    jmp cleanexit

nrm:
    movaps (%rdi),%xmm2
    movaps (%rsi),%xmm3
    divps %xmm1,%xmm2
    divps %xmm1,%xmm3
    movntdq %xmm2,(%rdi)
    movntdq %xmm3,(%rsi)
    addq $16,%rdi
    addq $16,%rsi
    loop nrm

cleanexit:
    fstp %st(0)
    popq %r8
    popq %r9
    popq %r10
    popq %r11
    popq %r12
    popq %r13
    popq %r14
    popq %r15
    addq $0xFF,%rsp
    popq %rbp
    leave
    ret
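
The "recurrence formula" step in the inner loop appears to rely on the identity cos((k+1)θ) = 2·cos(θ)·cos(kθ) − cos((k−1)θ), and the same for sin. This lets each new twiddle factor be computed from the two previous ones with one multiply and one subtract, instead of a new fsincos. The C++ sketch below shows that identity on its own. twiddlesByRecurrence is a name made up for this example, and it works in double precision on std::complex rather than on packed floats like the assembly.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Returns w[k] = (cos(k*theta), sin(k*theta)) for k = 0..count-1, using
//   w[k+1] = 2*cos(theta)*w[k] - w[k-1]
// after seeding w[0] and w[1] with directly computed values.
inline std::vector<std::complex<double>> twiddlesByRecurrence(double theta, int count) {
    assert(count >= 0);
    std::vector<std::complex<double>> w;
    if (count == 0) return w;
    w.emplace_back(1.0, 0.0);  // k = 0
    if (count == 1) return w;
    w.emplace_back(std::cos(theta), std::sin(theta));  // k = 1
    const double twoCos = 2.0 * std::cos(theta);       // plays the role of xmm0
    for (int k = 2; k < count; ++k) {
        const std::size_t n = w.size();
        // Real and imaginary parts both follow the same recurrence.
        w.push_back(twoCos * w[n - 1] - w[n - 2]);
    }
    return w;
}
```

For a short run such as 16 factors the results agree with direct cos/sin evaluation to well below 1e-9 in double precision. Rounding error does accumulate along the recurrence, so long runs in single precision would be less accurate.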