I have 10 datasets, each with the same variables (e.g., age and income) but different numbers of observations.

Now consider a binary variable $X$ defined at the dataset level: it takes the value $0$ or $1$ and is constant across all observations within a dataset. For 5 datasets, $X=0$; for the other 5, $X=1$.

How do I create a regression model for a variable of these datasets (e.g., age) that takes into account this "meta-variable" $X$?

A simple solution would be to append a new column for $X$ to each dataset, where the same value is repeated for all observations, and then concatenate the datasets. However, I think there are better ways.

Pippo

1 Answer

I believe adding the meta-variable $X$ as a column to each dataset and then combining them is a good option. However, why do you have 10 different datasets rather than a single dataset to start with?

Is there a special meaning attached to each individual dataset, which is why they were not combined in the first place? If so, that suggests you need an additional categorical variable to differentiate between the individual datasets.
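To make the suggestion concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the data are randomly generated, and the names `datasets`, `x_per_dataset`, `combined`, and `dataset_id` are hypothetical. It appends $X$ and a dataset identifier to every row, concatenates the rows, and fits the simplest possible model, age regressed on $X$ alone, for which ordinary least squares reduces to a difference of group means.

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical data: 10 datasets of different sizes, same variables in each.
datasets = []
for _ in range(10):
    n_rows = random.randint(20, 40)
    rows = [
        {"age": random.gauss(40, 10), "income": random.gauss(50_000, 8_000)}
        for _ in range(n_rows)
    ]
    datasets.append(rows)

# Assumed layout: X = 0 for the first 5 datasets, X = 1 for the other 5.
x_per_dataset = [0] * 5 + [1] * 5

# Append X and a dataset identifier to every row, then concatenate.
combined = []
for dataset_id, (rows, x) in enumerate(zip(datasets, x_per_dataset)):
    for row in rows:
        combined.append({**row, "X": x, "dataset_id": dataset_id})

# OLS of age on an intercept and a single binary regressor has a closed form:
#   intercept = mean(age | X = 0)
#   slope     = mean(age | X = 1) - mean(age | X = 0)
age_x0 = [r["age"] for r in combined if r["X"] == 0]
age_x1 = [r["age"] for r in combined if r["X"] == 1]
intercept = mean(age_x0)
slope = mean(age_x1) - mean(age_x0)

print(f"rows: {len(combined)}, intercept: {intercept:.2f}, slope on X: {slope:.2f}")
```

The `dataset_id` column is not used in this fit. It is kept so that differences between individual datasets can be modeled later if they turn out to matter. Since $X$ is constant within each dataset, a full set of dataset indicators would be perfectly collinear with $X$, so the identifier is better suited to serve as a grouping factor than as a second set of dummy variables.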

Jayaram Iyer