blog

Experimenting with JSON (structured and unstructured data) and Go

As I said in my previous post, I've been experimenting with a few different languages lately, and I was curious to know how Go handles JSON - more specifically, I want to know how it handles structured and unstructured data. # Structured data Suppose we have a JSON string like this: ```json { "name": "John", "age": 34 } ``` To handle a data structure like this, we can first build a struct to describe its model: ```go // `json:___` instructs the Marshal/Unmarshal functions how to map // the properties from the JSON string with the struct. They are not // necessary in this case because they have the same name type Person struct { Name string `json:"name"` Age string `json:"age"` } ``` To convert from and to JSON, we can use the functions `json.Unmarshal` and `json.Marshal`, which are analogous to `JSON.parse` and `JSON.stringify` in JavaScript. Take a look at this example, where I take an input string, parse it into a struct, and then stringify the struct back: ```go package main import ( "encoding/json" "fmt" ) type Person struct { Name string `json:name` Age int32 `json:age` } func main() { inputString := []byte(` { "name": "John", "age": 34 }`) var person Person // Analogous to "person = JSON.parse(inputString)" json.Unmarshal(inputString, &person) // Result struct: {Name:John, Age:34} fmt.Printf("Result struct: %v\n", person) // Analogous to "stringifedPerson = JSON.stringify(person)" stringifiedPerson, _ := json.Marshal(&person) // Stringified from struct: {"Name":"John","Age":34} fmt.Printf("Stringified from struct: %s\n", stringifiedPerson) } ``` Not bad at all. But what about nested objects and arrays? Let's increase the complexity a little bit with this JSON: ```json { "name": "John", "age": 34, "address": { "street": "1 Front Street", "unitNo": 123 }, "cars": [ { "model": "Honda Civic", "year": 2015 }, { "model": "Toyota Corolla", "year": 2013 } ] } ``` Now we have nested objects and also an array of objects. Let's modify the struct(s) to handle this data: ```go type Address struct { Street string `json:"street"` UnitNo int32 `json:"unitNo"` } type Car struct { Model string `json:"model"` Year int32 `json:"year"` } type Person struct { Name string `json:"name"` Age int32 `json:"age"` Address Address `json:"address"` Cars []Car `json:"cars"` } ``` Now let's see what Go can do (I formatted the output so it is better to read): ```shell Result struct: { Name:John Age:34 Address:{Street:1 Front Street UnitNo:123} Cars:[ {Model:Honda Civic Year:2015} {Model:Toyota Corolla Year:2013} ] } Stringified from struct: { "name":"John", "age":34, "address":{ "street":"1 Front Street", "unitNo":123 }, "cars":[ {"model":"Honda Civic","year":2015}, {"model":"Toyota Corolla","year":2013} ] } ``` Very cool! It seems like nested properties are not a problem at all! # Structured data + optional properties The next thing I wanted to know is: what if I have optional properties? Besides just using the regular structs that we made, there are two additional combinations I want to test: 1. Using a pointer instead of a regular variable. I believe this will set the values as **null** if they are not present. 1. Using the option **omitempty** inside the **json:"__"** directive. I believe it will not have impact on how the JSON gets parsed, but it will have an impact on the result after it is stringified. To test this, I made 8 combinations of properties 1. Regular string property 1. Regular string property with **omitempty** 1. String pointer 1. String pointer with **omitempty** 1. Object property 1. Object property with **omitempty** 1. Object pointer 1. Object pointer with **omitempty** Here are my structs: ```go /* Naming: Ex = "explicit", no "omitempty" Om = with "omitempty" Val = not a pointer Ptr = pointer Obj = object */ type SubStruct struct { ExVal string `json:"exVal"` OmVal string `json:"omVal,omitempty"` ExPtr *string `json:"exPtr"` OmPtr *string `json:"omPtr,omitempty"` } type MainStruct struct { ExVal string `json:"exVal"` OmVal string `json:"omVal,omitempty"` ExPtr *string `json:"exPtr"` OmPtr *string `json:"omPtr,omitempty"` ExValObj SubStruct `json:"exValObj"` OmValObj SubStruct `json:"omValObj,omitempty"` ExPtrObj *SubStruct `json:"exPtrObj"` OmPtrObj *SubStruct `json:"omPtrObj,omitempty"` } ``` And here are my two repetitions. One of them is completely empty, while the other one is completely filled: ```go rep1 := []byte(` { }`) rep2 := []byte(` { "exVal": "string 1", "omVal": "string 2", "exPtr": "string 3", "omPtr": "string 4", "exValObj": { "exVal": "string 1", "omVal": "string 2", "exPtr": "string 3", "omPtr": "string 4" }, "omValObj": { "exVal": "string 1", "omVal": "string 2", "exPtr": "string 3", "omPtr": "string 4" }, "exPtrObj": { "exVal": "string 1", "omVal": "string 2", "exPtr": "string 3", "omPtr": "string 4" }, "omPtrObj": { "exVal": "string 1", "omVal": "string 2", "exPtr": "string 3", "omPtr": "string 4" } }`) ``` And here are the results: ##### All properties empty (struct) ```go { ExVal: OmVal: ExPtr: OmPtr: ExValObj: { ExVal: OmVal: ExPtr: OmPtr: } OmValObj: { ExVal: OmVal: ExPtr: OmPtr: } ExPtrObj: OmPtrObj: } ``` When unmarshalling (parsing), the **omitempty** directive apparently makes no difference. The difference we see relates to the type of the variable: if it is a regular string, it is empty, if it is a pointer, it is a null pointer. Now let's see with the filled properties. ##### All properties filled (struct) ```go { ExVal: string 1 OmVal: string 2 ExPtr: 0xc42000e250 OmPtr: 0xc42000e260 ExValObj: { ExVal: string 1 OmVal: string 2 ExPtr: 0xc42000e270 OmPtr: 0xc42000e280 } OmValObj: { ExVal: string 1 OmVal: string 2 ExPtr: 0xc42000e290 OmPtr: 0xc42000e2a0 } ExPtrObj: 0xc420078270 OmPtrObj: 0xc4200782a0 } ``` Again, no difference for the **omitempty** directive. The only difference is the type of the variable: strings get their string value, while string pointers are populated with the address of the string. Now let's see what happens when we stringify these structs back. ##### All properties empty (stringified) ```json { "exVal": "", "exPtr": null, "exValObj": { "exVal": "", "exPtr":null }, "omValObj": { "exVal":"", "exPtr":null }, "exPtrObj":null } ``` This is more interesting. In this case, we are missing three properties: `omVal`, `omPtr`, and `omPtrObj`. These properties were not included in the string for two reasons: they were either null or an empty primitive, and they had the directive **omitempty**. Sometimes we want to return **null** values from APIs, and using pointers in this case seems to be a good way to do it. We can return a regular value from it by assigning a value to a pointer, or assigning **nil** to it in order to send a **null** value. ##### All properties filled (stringified) ```json { "exVal":"string 1", "omVal":"string 2", "exPtr":"string 3", "omPtr":"string 4", "exValObj": { "exVal":"string 1", "omVal":"string 2", "exPtr":"string 3", "omPtr":"string 4" }, "omValObj":{ "exVal":"string 1", "omVal":"string 2", "exPtr":"string 3", "omPtr":"string 4" }, "exPtrObj":{ "exVal":"string 1", "omVal":"string 2", "exPtr":"string 3", "omPtr":"string 4" }, "omPtrObj":{ "exVal":"string 1", "omVal":"string 2", "exPtr":"string 3", "omPtr":"string 4" } } ``` No big surprises here. The only thing I found notable is that the `Marshal` function was smart enough to dereference the pointers, so their real values (and not addresses) were printed! # Unstructured data Go handles JSON structured data really well, but how about unstructured? Let's say I have this payload here: ```json { "portugal": "lisbon", "spain": "madrid", "france": "paris", "germany": "berlin", "netherlands": "amsterdam" } ``` This file contains the capital of a few european countries. Of course, it could contain completely different countries. How would we Marshal/Unmarshal this? The problem here is that we can not use Structs for this case, since we don't have a predefined structure. What we can do is treat the object as a key-value map. Keys, in this case, would be strings, as well as the values. So instead of recording the parsed data into a struct, we can record it into a `map[string]string`! Here is the code: ```go package main import ( "encoding/json" "fmt" ) func main() { unstructuredData := []byte(` { "portugal": "lisbon", "spain": "madrid", "france": "paris", "germany": "berlin", "netherlands": "amsterdam" }`) var parsedData map[string]string json.Unmarshal(unstructuredData, &parsedData) // Resulting map: map[portugal:lisbon spain:madrid france:paris germany:berlin netherlands:amsterdam] fmt.Printf("Resulting map: %+v\n", parsedData) // Here is how we can access a property! // "The capital of Portugal is lisbon" fmt.Printf("The capital of Portugal is %s\n", parsedData["portugal"]) stringifiedData, _ := json.Marshal(&parsedData) // {"france":"paris","germany":"berlin", "netherlands":"amsterdam", // "portugal":"lisbon","spain":"madrid"} fmt.Printf("%s\n", stringifiedData) } ``` But let's say that our JSON is a little more complex: ```json { "countries": { "portugal": "lisbon", "spain": "madrid" }, "cities": [ "oslo", "london", "milan" ] } ``` How can we parse this? In this case, we are still dealing with a map with *string* keys, but the values are now of any type. We can use `interface{}` (empty interface) to identify this type: ```go var parsedData map[string]interface{} json.Unmarshal(unstructuredData, &parsedData) // Resulting map: map[countries:map[portugal:lisbon spain:madrid] cities:[oslo london milan]] fmt.Printf("Resulting map: %+v\n", parsedData) ``` Now let's go down one level and get a capital of a country again. Since we said that the value types are `interface{}`, we can't use the type `map[string]string` anymore. Instead, we will need to use `map[string]interface{}`: ```go countriesData := parsedData["countries"].(map[string]interface{}) // The capital of Spain is madrid fmt.Printf("The capital of Spain is %s\n", countriesData["spain"]) ``` Another example, now casting the list of cities into an array of interfaces and then printing them: ```go // oslo // london // milan for _, city := range parsedData["cities"].([]interface{}) { fmt.Println(city) } ``` Now let's finish by stringifying it back and printing it: ```go stringifiedData, _ := json.Marshal(&parsedData) // {"cities":["oslo","london","milan"],"countries":{"portugal":"lisbon","spain":"madrid"}} fmt.Printf("%s\n", stringifiedData) ``` # Edge cases - **What happens if my JSON string has a property not mapped in the struct?** It gets ignored. - **What happens if I try to access something that doesn't exist in the maps for unstructured data?** If you just try reading its value, you get an empty value back. If you try using it as a list or an object, the program *panics*. You just need to check if the property exists or not before accessing it. --- I must say that I am very impressed by how Go handles JSON. The type system is very simple, yet robust and powerful. We were able to parse structured and unstructured data very easily, and we easily converted it back into strings.