My main question is: is there any ML research paper on splitting a PDF that contains a batch of scanned documents (e.g. bank statements) into individual documents?
I have searched for this, but I have not found any relevant research paper, or any application at all, mentioned on the Internet.
I am mainly interested in the feature engineering used in such papers or applications, but also in the overall approach.