In BigQuery, STRUCTS and ARRAYS are like the odd couple of data types. They each have their quirks, but together they make your data look neat and organized.
STRUCTS are like little data bundles, packing several fields (think of them as mini-columns) into a single column. Each field inside the STRUCT has a specific data type. If you want, you can even sneak an ARRAY inside a STRUCT—although that still counts as one field.
On the other hand, ARRAYS are more like collections of items (rows) crammed into a single row, but all elements in the ARRAY need to be of the same type. You can’t have an ARRAY inside another ARRAY (sorry to burst your nesting dreams), but there’s a workaround: you can use an ARRAY of STRUCTS!
How to Use STRUCTS and ARRAYS in BigQuery?
Let’s keep it simple:
1. Use STRUCTS when you’ve got a bunch of fields that belong together—like an address with city, street, and postal code all in one. Structs are the perfect little containers to keep things tidy.
2. Use ARRAYS when you’ve got a list of things, like instruments someone plays or favorite pizza toppings (as long as they’re all the same type of item). Arrays let you store these lists neatly in one row.
3. Combine them with an ARRAY of STRUCTS when you need to track a list of items, each with multiple details, like a list of employee certifications with the date and certification name.
Quick Summary:
STRUCTS are like little organized packets of fields, great for related data. ARRAYS are lists of items, perfect for holding multiple things of the same type. And when you want to mix it up? Go for an ARRAY of STRUCTS—perfect for a list of objects, each with multiple attributes. But don't expect to sort or group them too easily!
P.S.: now you can use the GROUP BY clause and the SELECT DISTINCT clause with the ARRAY and STRUCT data types. This feature is in Preview.
If you work with GA4 to BigQuery exports, be sure to check out my SQL cheat sheet.