AI-generation not working with uploaded pdf files

bud · December 19, 2024, 3:04pm

My current build:
Baserow 1.29 - self-hosted through a Linode, with docker containers managed by Easypanel

I got AI working great with gtp-4o-mini via the environment variables and my API key, however I’ve run into an issue when trying to analyze pdf files uploaded.

Specifically, I’d like to analyze the owner’s manuals of equipment, so I’m attempting to upload the PDF manuals to a “File” field, then utilize the “AI prompt” field to pull whatever information I want.

When I attempt to “Generate”, the field loads/spins for a few seconds before showing an API_key error, however when running AI with text-only prompts, it works fine. Seems that the files are the issue for some reason.

petrs · December 19, 2024, 3:39pm

Hi @bud,

did you pick the file field on the AI prompt field form? How big are the files?

bud · December 19, 2024, 3:52pm

Yes, I have selected the field in the “Files” dropdown within the AI prompt field. The files are typically 1MB - 5MB max.

I am not referencing the file field directly in the prompt, however, since I assumed I wouldn’t need to as it’s referenced in the dropdown… would this be a cause of the issue?

bud · December 19, 2024, 3:55pm

Here’s a screenshot of the current setup.

petrs · December 20, 2024, 10:06am

You are correct - selecting the file field in that form should be enough, you don’t need to reference it in the text.

Could you please examine the logs of your Baserow instance, specifically where the Celery export worker runs? I think you should see some error message there.

bud · December 20, 2024, 3:05pm

Thanks for your message, I’ve since upgraded to 1.30, and have adjusted the num of workers to 4 in my environment variables, which didn’t change the outcome.

Below is the only error I’m seeing from the celery workers when attempting to run the ai prompt:

[EXPORT_WORKER][2024-12-20 15:02:01] TypeError: Messages.create() got an unexpected keyword argument 'temperature'

bud · December 20, 2024, 3:10pm

Update: ran it again and got more logs:

[EXPORT_WORKER][2024-12-20 15:06:13] [2024-12-20 15:06:12,693: INFO/MainProcess] Task baserow_premium.fields.tasks.generate_ai_values_for_rows[5db7d1df-567a-41f8-9755-2ffbdaf245e0] received
[EXPORT_WORKER][2024-12-20 15:06:13] [2024-12-20 15:06:13,427: INFO/ForkPoolWorker-4] HTTP Request: POST https://api.openai.com/v1/files “HTTP/1.1 200 OK”
[EXPORT_WORKER][2024-12-20 15:06:14] [2024-12-20 15:06:13,901: INFO/ForkPoolWorker-4] HTTP Request: POST https://api.openai.com/v1/assistants “HTTP/1.1 200 OK”
[EXPORT_WORKER][2024-12-20 15:06:14] [2024-12-20 15:06:14,284: INFO/ForkPoolWorker-4] HTTP Request: POST https://api.openai.com/v1/threads “HTTP/1.1 200 OK”
[EXPORT_WORKER][2024-12-20 15:06:15] [2024-12-20 15:06:14,741: INFO/ForkPoolWorker-4] HTTP Request: DELETE https://api.openai.com/v1/threads/thread_znZBnDubF7hvaZaM9srUMrCn “HTTP/1.1 200 OK”
[EXPORT_WORKER][2024-12-20 15:06:15] [2024-12-20 15:06:15,060: INFO/ForkPoolWorker-4] HTTP Request: DELETE https://api.openai.com/v1/assistants/asst_fB6Ug0QhEz4NPhYw8QzzxjLr “HTTP/1.1 200 OK”
[CELERY_WORKER][2024-12-20 15:06:15] [2024-12-20 15:06:09,018: INFO/ForkPoolWorker-4] Task baserow.ws.tasks.broadcast_to_channel_group[111b9863-1708-47df-af46-128cd6d9ddbe] succeeded in 0.004612003453075886s: None
[EXPORT_WORKER][2024-12-20 15:06:15] [2024-12-20 15:06:15,559: INFO/ForkPoolWorker-4] HTTP Request: DELETE https://api.openai.com/v1/files/file-DJUv8rmjemXZwC5kyL1GFo “HTTP/1.1 200 OK”
[EXPORT_WORKER][2024-12-20 15:06:15] [2024-12-20 15:06:15,563: ERROR/ForkPoolWorker-4] Task baserow_premium.fields.tasks.generate_ai_values_for_rows[5db7d1df-567a-41f8-9755-2ffbdaf245e0] raised unexpected: TypeError(“Messages.create() got an unexpected keyword argument ‘temperature’”)
[EXPORT_WORKER][2024-12-20 15:06:15] Traceback (most recent call last):
[EXPORT_WORKER][2024-12-20 15:06:15] File “/baserow/venv/lib/python3.11/site-packages/celery/app/trace.py”, line 453, in trace_task
[EXPORT_WORKER][2024-12-20 15:06:15] R = retval = fun(*args, **kwargs)
[EXPORT_WORKER][2024-12-20 15:06:15] ^^^^^^^^^^^^^^^^^^^^
[EXPORT_WORKER][2024-12-20 15:06:15] File “/baserow/venv/lib/python3.11/site-packages/celery/app/trace.py”, line 736, in protected_call
[EXPORT_WORKER][2024-12-20 15:06:15] return self.run(*args, **kwargs)
[EXPORT_WORKER][2024-12-20 15:06:15] ^^^^^^^^^^^^^^^^^^^^^^^^^
[EXPORT_WORKER][2024-12-20 15:06:15] File “/baserow/premium/backend/src/baserow_premium/fields/tasks.py”, line 132, in generate_ai_values_for_rows
[EXPORT_WORKER][2024-12-20 15:06:15] raise exc
[EXPORT_WORKER][2024-12-20 15:06:15] File “/baserow/premium/backend/src/baserow_premium/fields/tasks.py”, line 106, in generate_ai_values_for_rows
[EXPORT_WORKER][2024-12-20 15:06:15] raise exc
[EXPORT_WORKER][2024-12-20 15:06:15] File “/baserow/premium/backend/src/baserow_premium/fields/tasks.py”, line 98, in generate_ai_values_for_rows
[EXPORT_WORKER][2024-12-20 15:06:15] value = generative_ai_model_type.prompt_with_files(
[EXPORT_WORKER][2024-12-20 15:06:15] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[EXPORT_WORKER][2024-12-20 15:06:15] File “/baserow/backend/src/baserow/core/generative_ai/generative_ai_model_types.py”, line 135, in prompt_with_files
[EXPORT_WORKER][2024-12-20 15:06:15] message = client.beta.threads.messages.create(
[EXPORT_WORKER][2024-12-20 15:06:15] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[CELERY_WORKER][2024-12-20 15:06:15] [2024-12-20 15:06:15,561: INFO/MainProcess] Task baserow.ws.tasks.broadcast_to_channel_group[b214b9b4-d29c-46a2-8fec-eeb4936b3b65] received

petrs · December 20, 2024, 4:22pm

[EXPORT_WORKER][2024-12-20 15:02:01] TypeError: Messages.create() got an unexpected keyword argument 'temperature'

Yes, this is the problem. You can try setting temperature to 1 but I suspect that won’t work either.

It seems that Baserow can’t be used with this model at this time, we could open a ticket to support it. Try to use the file field with gpt-3.5-turbo or gpt-4-turbo-preview models. These should work.

bud · December 20, 2024, 5:22pm

it’s odd because as soon as I set the “File” to none, the prompt will work based on info fed to it in the prompt text. Seems to be just the file that’s messing with it. Temp, as you suggested, didn’t change it.

I’ll attempt to include a few other models and see what works.

Thanks for your help.

Edit: I have tried gpt-3.5-turbo to no avail, I may try to link an additional non-openai api when i have time next.

petrs · December 28, 2024, 1:42pm

Do I understand it correctly that you didn’t manage to use the file field with ANY openai models?

bud · December 30, 2024, 4:21pm

Correct, I have not been able to get any OpenAI models to work for file-reading so far. They work great for text-only but not for files.