Ingesting <id>.json.gz with 1M records

Hi all,
Looking for some suggestions.
I have an inbound .json.gz file containing a single JSON array of ~1M records/documents.
I need to decompose the array into its individual documents. I was looking at awswrangler, but it seems to:
#1 unpack everything into a pandas DataFrame, so I lose my JSON structure, and
#2 I don't see how awswrangler can work with the .gz file inline/natively.

Suggestions please.
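Since the payload is one big JSON array, it can be split into documents with just the stdlib (ijson is the usual third-party answer for fully incremental parsing). A minimal sketch using `json.JSONDecoder.raw_decode` over chunks, so the whole array is never parsed into one Python list; the function name and chunk size are mine:

```python
import gzip
import json

def iter_json_array(path, chunk_size=64 * 1024):
    """Yield each document from a gzipped JSON array one at a time,
    without materialising the full list of parsed objects.
    Assumes the top-level value is an array of objects."""
    dec = json.JSONDecoder()
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        buf = fh.read(chunk_size).lstrip()
        if not buf.startswith("["):
            raise ValueError("expected a JSON array")
        buf = buf[1:]  # drop the opening bracket
        while True:
            buf = buf.lstrip().lstrip(",").lstrip()
            if buf.startswith("]"):
                return  # end of array
            try:
                # parse one document from the front of the buffer
                obj, end = dec.raw_decode(buf)
            except json.JSONDecodeError:
                more = fh.read(chunk_size)
                if not more:
                    raise  # truly malformed input
                buf += more  # document was cut mid-chunk; read more
                continue
            yield obj
            buf = buf[end:]
```

Memory stays bounded by the largest single document plus one chunk, which is what matters at 1M records.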

So the solution ended up as a Lambda-based Python function:
inline gunzip, readline, and post each document to a Confluent Kafka cluster.
I achieved 7000 TPS (5500 on my M1 Mac :smile:).
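For reference, a sketch of that hot loop, assuming one document per line inside the array (the function name is mine; the Kafka wiring is shown only in comments since it needs a live broker):

```python
import gzip
import json

def stream_to_kafka(path, produce):
    """Decompress a .json.gz file line by line and hand every JSON
    document to `produce`. Assumes one document per line, with the
    array brackets and trailing commas around/after each line."""
    count = 0
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip().rstrip(",")
            if line in ("", "[", "]"):
                continue  # array delimiters, not documents
            doc = json.loads(line)  # validate before producing
            produce(doc)
            count += 1
    return count

# In the Lambda, `produce` would wrap confluent_kafka.Producer, e.g.
#   producer.produce(topic, value=json.dumps(doc).encode())
# calling producer.poll(0) periodically and producer.flush() at the end.
```

Batching the produce calls and letting the client's internal queue absorb bursts is what gets the throughput into the thousands of TPS.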

Busy doing a Golang version; initial numbers suggest it will be north of 10 000 TPS.

  • Sitting with an error though when trying to build the Golang binary;
    see another thread here.