First of all, I am a complete newbie to data science. I am not asking for a complete solution, just some guidance on what I should read up on to accomplish my task (which algorithms, techniques, etc. are used to tackle similar problems).
I have several lists of strings, each of which contains one or two useful pieces of information I would like to extract. In the example below, I need to extract the brand and the model from each line. This is just an example, though; eventually I need a process I can apply to different lists in different contexts. Here's a small sample from a list of 500:
- 50" Sony KDL 50W756CSAEP Smart LED Full HD
- 55" Samsung UE55JU6400 Smart LED HD
- LG 55LF652V 55" SMART 3D FULL HD
- HITACHI 55HGW69 55'' LED ULTRA SMART WIFI
- TV 65" SAMSUNG UE65KS7500 4K LED Smart
In my full list I have already manually extracted the brand and model, so what I need now is a way to automate the process for a new list containing more brands and models. I thought about handling this with heuristics, but since I need to do the same for other types of data as well, hand-written rules won't generalize well.
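To make the heuristic idea concrete, here is a rough sketch of the kind of rule I had in mind. The brand lexicon `KNOWN_BRANDS`, the function name `extract_brand_model`, and the rule "the model is the first token after the brand that mixes letters and digits" are all assumptions made up for this sample, not something I expect to hold for other lists.

```python
import re

# Hypothetical brand lexicon (assumption): a real list would need many more entries.
KNOWN_BRANDS = {"sony", "samsung", "lg", "hitachi"}


def extract_brand_model(line):
    """Guess (brand, model) from one product title; either part may be None."""
    tokens = line.split()

    # Rule 1: the brand is the first token found in the known-brand lexicon.
    brand, brand_idx = None, None
    for i, tok in enumerate(tokens):
        if tok.lower() in KNOWN_BRANDS:
            brand, brand_idx = tok, i
            break

    # Rule 2 (assumption): the model is the first token after the brand
    # that contains both a letter and a digit.
    model = None
    if brand_idx is not None:
        for tok in tokens[brand_idx + 1:]:
            if re.search(r"[A-Za-z]", tok) and re.search(r"\d", tok):
                model = tok
                break

    return brand, model


if __name__ == "__main__":
    sample = [
        '50" Sony KDL 50W756CSAEP Smart LED Full HD',
        '55" Samsung UE55JU6400 Smart LED HD',
        'LG 55LF652V 55" SMART 3D FULL HD',
    ]
    for title in sample:
        print(title, "->", extract_brand_model(title))
```

This already shows the weakness: any brand missing from the lexicon is not found at all, and for the Sony line the rule picks up only `50W756CSAEP` and skips the preceding `KDL` token, which may well belong to the model. Every new list or product type would need its own set of rules.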
So can someone give me some suggestions on a good way to go about it?
Thanks!