Warning in Generic Training Dataset Extractor in demo

Whenever a dataset is created using the Generic Training Dataset Extractor, the following warning appears: ‘WARNING/arkindex_worker: This API helper update_dataset_state did not update the cache database.’
Additionally, fine-tuning PyLaia on this dataset never achieves a result better than val_cer = 1. This behaviour began after the dataset name ‘val’ was changed to ‘dev’.
Can you help us solve this problem?
Best regards
Irene

Hi @ipruevora

Thanks for trying out Arkindex.
Could you please share any links that may help us troubleshoot, for instance links to your processes and your dataset?


Yoann Schneider

Dear Yoann Schneider,

The identifiers for the project, processes, and dataset are below. I hope this information is sufficient.

Best regards

Irene Rodrigues

Process to complete the dataset:

  • in project id: 8ba3417f-e7be-4e27-83e9-767452bbbd52 (DemoPlus1)
  • Dataset id: a1c30ab7-7a7d-4059-88d8-3d3b61574dfb (‘Dataset Docs 80 + 10 t’)
  • Process id : b4a66bc9-614d-4709-b692-0b8e70cf3ee2 (generic-training-dataset_a1221a)

Process to fine-tune:

  • in project id: 8ba3417f-e7be-4e27-83e9-767452bbbd52 (DemoPlus1)
  • in dataset id: a1c30ab7-7a7d-4059-88d8-3d3b61574dfb (‘Dataset Docs 80 + 10 t’)
  • Process id: ebf72448-3d59-4e99-82ce-c6040c6ca30e (worker PyLaia Training,
    model PyLaia | Norwegian Hugin Munin,
    configuration id: e85ea043-0c77-4415-a42a-1a3acd6301f2 (Configuração Textos 62 (NOVA)))