I'm trying to create a machine learning project to predict Pre-Eclampsia (a hypertensive condition that can affect pregnant women from the 20th week of pregnancy onward).
I have access to clinical data from pregnant women that could help predict this condition. However, there is no target column indicating the condition (for example, a column called “Pre-Eclampsia” whose rows hold true or false). Because I'm not a doctor, I don't think it would be appropriate to fill in this column manually just to use a supervised algorithm, even though reading articles gives me a sense of the criteria used to diagnose the disease.
Therefore, I would have to opt for unsupervised learning, and MeanShift was recommended to me. After cleaning and adjusting the dataset I started using it, but I would like to be able to identify which instances are in each generated cluster.
As you can see, there are groups of instances, and I would like to know the attributes of the instances in each group so that I can find out (perhaps with a doctor) which of these groups correspond to Pre-Eclampsia.
If you have new ideas and suggestions, please feel free to speak up.
Below is the Python code showing how I'm applying MeanShift.
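import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth

# `data` is the cleaned and adjusted dataset
bandwidth = estimate_bandwidth(data, quantile=0.2)
ms = MeanShift(bandwidth=bandwidth, bin_seeding=True)
ms.fit_predict(data)
labels = ms.labels_  # cluster index assigned to each row of `data`
cluster_centers = ms.cluster_centers_
labels_unique = np.unique(labels)
n_clusters_ = len(labels_unique)
print(f'Estimated number of clusters: {n_clusters_}')
>>> Estimated number of clusters: 8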
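Since `labels` holds one cluster index per row, in the same order as the rows of `data`, one possible way to see which instances fall in each cluster is to attach the labels as a new column and group by it. The following is only a minimal sketch under assumptions: it uses synthetic data from `make_blobs` as a stand-in for the real clinical dataset, and the column names (`feature_a`, `feature_b`, `feature_c`) and the `clustered` DataFrame are hypothetical placeholders.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import MeanShift, estimate_bandwidth
from sklearn.datasets import make_blobs

# Synthetic stand-in for the cleaned clinical dataset (NOT real data);
# the column names are hypothetical placeholders.
X, _ = make_blobs(n_samples=300, centers=3, n_features=3,
                  cluster_std=0.6, random_state=0)
data = pd.DataFrame(X, columns=["feature_a", "feature_b", "feature_c"])

# Same MeanShift setup as in the question
bandwidth = estimate_bandwidth(data, quantile=0.2)
ms = MeanShift(bandwidth=bandwidth, bin_seeding=True)
labels = ms.fit_predict(data)  # one cluster index per row, in row order

# Attach each row's cluster label to a copy of the original data
clustered = data.copy()
clustered["cluster"] = labels

# All instances (with their attributes) that belong to cluster 0
cluster_0 = clustered[clustered["cluster"] == 0]

# Number of instances per cluster and the mean of each attribute per cluster
sizes = clustered["cluster"].value_counts().sort_index()
profile = clustered.groupby("cluster").mean()

print(sizes)
print(profile)
```

A per-cluster table like `profile` could be something to review with a doctor, since it summarizes how the attributes differ between groups. If the features are on very different scales, the clusters may be dominated by the largest-valued attribute, so standardizing before clustering might be worth considering.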
Thanks in advance for all your help.

