Things to note:
1) supply a JSON schema in `config.response_schema`
2) set `config.response_mime_type` to `application/json`
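A minimal sketch of both settings with the `google-generativeai` Python SDK, which accepts a `TypedDict` as the schema (the model name and `Product` type here are just placeholders):

```python
import typing_extensions as typing
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder

# Placeholder output type; the SDK converts TypedDicts into a response schema.
class Product(typing.TypedDict):
    name: str
    price: float

model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content(
    "List three fictional products with prices.",
    generation_config=genai.GenerationConfig(
        response_mime_type="application/json",  # 2) force JSON output
        response_schema=list[Product],          # 1) constrain its shape
    ),
)
print(response.text)  # a JSON string conforming to the schema
```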
That works for me reliably. I've occasionally hit max_token constraints, but that was usually my own fault for processing a large list in a single inference call, which produces very large outputs; batching the input (sketched below) avoids that.
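For illustration, a batching sketch reusing `model` and `Product` from above (`big_list` and the batch size of 50 are made up; tune them to your data and schema):

```python
import json

config = genai.GenerationConfig(
    response_mime_type="application/json",
    response_schema=list[Product],
)

def chunked(items, size):
    # Split the work so each call's output stays well under max_tokens.
    for i in range(0, len(items), size):
        yield items[i:i + size]

results = []
for batch in chunked(big_list, 50):  # big_list and 50 are placeholders
    resp = model.generate_content(
        f"Return one Product per input item: {json.dumps(batch)}",
        generation_config=config,
    )
    results.extend(json.loads(resp.text))  # each response is a JSON array
```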
We're using Gemini's JSON mode in production applications with both `google-generativeai` and `langchain` without issues.