I am trying to build a resume parser (PDF to JSON). After extracting the text from a PDF as one long string, how would you split the string into different sections, as the red lines in the image show? Resumes come in different formats, and people use different labels for these sections. Is there a machine learning technique that I could look into? Thanks!

E.K.

1 Answer

This is one of the well-known implementations for your task. It works well in most cases. If you just need such a tool, you can use it as is. However, if you want to develop your own tool, you might want to study how it is structured.

It can also look for specific skills in resumes, as mentioned here.

As you require, it accepts PDF (and also DOC) files and returns JSON.
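
If you only need to divide the text into sections, a simple baseline to try before any machine learning is to match lines against a list of known section labels. The following is a minimal sketch of that idea, not the approach of the tool mentioned above. It assumes the extracted text still contains line breaks and that each section label sits on its own line. `SECTION_LABELS` and `split_sections` are hypothetical names, and the label list is only a starting point.

```python
import re

# Hypothetical label vocabulary (an assumption); extend it for the resumes you see.
SECTION_LABELS = {
    "education": ["education", "academic background"],
    "experience": ["experience", "work experience", "employment",
                   "professional experience"],
    "skills": ["skills", "technical skills"],
    "projects": ["projects"],
}


def _normalize(line):
    # Lowercase and drop punctuation/digits so "EDUCATION:" matches "education".
    return re.sub(r"[^a-z ]", "", line.lower()).strip()


def split_sections(text):
    """Split resume text into sections keyed by a canonical section name."""
    # Map every known label variant to its canonical section name.
    lookup = {label: name
              for name, labels in SECTION_LABELS.items()
              for label in labels}
    # Lines before the first recognized label go into "header".
    sections = {"header": []}
    current = "header"
    for line in text.splitlines():
        name = lookup.get(_normalize(line))
        if name is not None:
            # The whole line is a known label: start (or resume) that section.
            current = name
            sections.setdefault(current, [])
        elif line.strip():
            sections[current].append(line.strip())
    return {name: "\n".join(lines) for name, lines in sections.items()}
```

Labels that are not in the list stay inside the preceding section, so this works best as a baseline to measure any learned line classifier against.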

Shahriyar Mammadli
  • Thanks, this is good but not exactly what I am looking for. I would like to just divide a resume into different sections for now. Please let me know if you have any thoughts. – E.K. Dec 02 '20 at 20:57