Kafka JSON schema and limitations

When you create a Kafka dataset, you can enter a custom JSON schema, which is then used when reading from or writing to the selected topic.

Caveats for working with JSON and Kafka input

The current implementation of JSON support in Kafka works as follows:

  • The schema is inferred from the first JSON record. This schema is then used to convert subsequent JSON records.
  • If a JSON record does not match the inferred schema, it is silently dropped (only a debug message is logged).

For example, consider a Kafka topic that contains the following JSON records:
1 - {"title":"The Matrix","year":1999,"cast":["Keanu Reeves","Laurence Fishburne","Carrie-Anne Moss","Hugo Weaving","Joe Pantoliano"],"genres":["Science Fiction"]}
2 - {"Test" : true}
3 - {"title":"Toy Story","year":1995,"cast":["Tim Allen","Tom Hanks","(voices)"],"genres":["Animated"]}
The Kafka input connector handles these messages as follows:
  • Infer the schema from the first incoming JSON record (message number 1).
  • Forward message number 1 to the next connector.
  • Drop message number 2 as it does not match the inferred schema.
  • Forward message number 3 to the next connector as it matches the inferred schema.
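
The behavior described above can be approximated with a minimal Python sketch. This is an illustration under stated assumptions, not the connector's actual implementation: the helper names (json_type, infer_schema, filter_records) are hypothetical, the "schema" is simplified to a mapping of top-level field names to coarse JSON types, and a record is assumed to match only when its field names and types are identical to those of the first record. The real matching rules may be stricter or looser.

```python
import json


def json_type(value):
    """Return a coarse JSON type name for a decoded value."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return "null"


def infer_schema(record):
    """Hypothetical schema: top-level field name -> coarse JSON type."""
    return {name: json_type(value) for name, value in record.items()}


def filter_records(raw_messages):
    """Infer the schema from the first record, then split the records
    into forwarded (matching) and dropped (non-matching) lists."""
    schema = None
    forwarded, dropped = [], []
    for raw in raw_messages:
        record = json.loads(raw)  # assumes each message is a JSON object
        if schema is None:
            schema = infer_schema(record)  # the first record defines the schema
        if infer_schema(record) == schema:
            forwarded.append(record)
        else:
            dropped.append(record)  # the connector would only log at debug level
    return forwarded, dropped


messages = [
    '{"title":"The Matrix","year":1999,"cast":["Keanu Reeves","Laurence Fishburne",'
    '"Carrie-Anne Moss","Hugo Weaving","Joe Pantoliano"],"genres":["Science Fiction"]}',
    '{"Test" : true}',
    '{"title":"Toy Story","year":1995,"cast":["Tim Allen","Tom Hanks","(voices)"],'
    '"genres":["Animated"]}',
]

forwarded, dropped = filter_records(messages)
print([record["title"] for record in forwarded])  # ['The Matrix', 'Toy Story']
print(dropped)  # [{'Test': True}]
```

Because the schema depends entirely on whichever record arrives first, an atypical first record would cause every well-formed record after it to be dropped in this sketch.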

Caveats for working with JSON and Kafka output

The Kafka output connector cannot properly handle the Bytes type.
