Questions about segmentation and model training in the Arkindex demo

Hi Arkindex team,

I am a master’s student in Digital Humanities and am currently exploring the Arkindex demo environment. I have run into two issues:

  1. Segmentation model training
    When following the official tutorial (“Training a Segmentation Model”), the task failed at the dataset locking step.
    I would like to ask whether segmentation model training is supported in the demo version.

  2. YOLO Segmenter availability
    In the tutorial video, I saw the use of a YOLO Segmenter for automated page segmentation.
    However, in my demo workspace, I only have access to “(Object detector) YOLO Inference”, “(Image classifier) YOLO training”, and “(Object detector) YOLO training”, but not the YOLO Segmenter.
    Is the segmenter module available in the demo, or is it limited to full installations?

Thank you for your help!

Best,
Jiaqi

Hello @jiaqi,

Could you post links to your failed training process and describe the steps you took to create it?

The “(Object detector) YOLO Inference” worker is the “YOLO Segmenter” mentioned in the tutorial video. We need to update that video to reflect the name change!

Hello, thank you for your help!

Here are the links:
Project
Locking the dataset

Please let me know if you need any other information!

I’ve tried many times, so the project might look a bit messy. If it helps, I can also start a new project and try again.

Thank you!

I updated your dataset process to use the most recent version of that worker (0.3.0), because the version it referenced was outdated (its Docker image was no longer available).

The dataset is currently building in Arkindex 1.9.3.

Hello again,

I am a Master’s student in Digital Humanities at Uppsala University, currently working on an internship project involving French historical patent documents. As part of the work, my supervisor and I need to transcribe historical handwritten French text from INPI’s scanned patent files.

I have two questions:

1. Does Arkindex currently have any HTR models for French handwritten documents?
I wasn’t able to find any HTR-related models or workers in the demo environment, so I wanted to check whether such models exist.

2. If not, are there any plans to develop or collaborate on French HTR models?
Our team would be very interested in exploring this possibility and would be happy to collaborate, assist with testing, or help develop such models together. If possible, could you also share a contact email for further discussion?

Thank you very much!

PS: If needed, I can share more details about the project and my supervisor privately, since this is a public forum.

Dear Jiaqi,

you can contact me at kermorvant @ teklia.com to discuss your project.

Best regards,


Christopher