
My teacher did this in class, and I'm wondering: is it OK to use .fit_transform with xtest? Shouldn't it just be poly.transform(xtest)?

Teacher's Code

from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(degree=3)
xtrain_poly = poly.fit_transform(xtrain)
xtest_poly = poly.fit_transform(xtest)

What I think it should be:

from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(degree=3)
xtrain_poly = poly.fit_transform(xtrain)
xtest_poly = poly.transform(xtest)

As an optional question, what do fit() and transform() do in PolynomialFeatures? Does transform() scale the data based on some value(s) computed by fit(), as is the case with sklearn.preprocessing.StandardScaler?

1 Answer


Technically there is no difference in the output between the two methods, the main reason being that fitting the PolynomialFeatures class to data does not save any data-dependent parameters internally, unlike for example the StandardScaler class (see also the source code for PolynomialFeatures). However, I would say it is still better to use transform instead of fit_transform, to stay consistent with how other scikit-learn transformers are used.
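To illustrate this, here is a minimal sketch with made-up toy data (the arrays and random seed are assumptions, not from the original post). It shows that refitting PolynomialFeatures on the test set gives the same output as transforming it with the instance fitted on the training set, while the same is not true for StandardScaler:

import numpy as np
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Toy data, assumed purely for illustration
rng = np.random.default_rng(0)
xtrain = rng.normal(size=(50, 2))
xtest = rng.normal(loc=5.0, size=(10, 2))

# PolynomialFeatures: fit() only records the number of input features,
# so refitting on xtest produces the same output as transform().
poly = PolynomialFeatures(degree=3)
poly.fit(xtrain)
print(np.allclose(poly.transform(xtest),
                  PolynomialFeatures(degree=3).fit_transform(xtest)))  # True

# StandardScaler: fit() stores the mean and standard deviation of the
# data it sees, so refitting on xtest gives a different result.
scaler = StandardScaler()
scaler.fit(xtrain)
print(np.allclose(scaler.transform(xtest),
                  StandardScaler().fit_transform(xtest)))  # False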

Oxbowerce