Singleton array array(<function train at 0x7f3a311320d0>, dtype=object) cannot be considered a valid collection


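Question

I'm not sure how to fix this. Any help is much appreciated. I saw the question "Vectorization: Not a valid collection", but I'm not sure I understood it.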

train = df1.iloc[:,[4,6]]
target =df1.iloc[:,[0]]

def train(classifier, X, y):
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=33)
    classifier.fit(X_train, y_train)
    print ("Accuracy: %s" % classifier.score(X_test, y_test))
    return classifier

trial1 = Pipeline([
         ('vectorizer', TfidfVectorizer()),
         ('classifier', MultinomialNB()),])

train(trial1, train, target)

The error is as follows:

    ----> 6 train(trial1, train, target)

    <ipython-input-140-ac0e8d32795e> in train(classifier, X, y)
          1 def train(classifier, X, y):
    ----> 2     X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=33)
          3 
          4     classifier.fit(X_train, y_train)
          5     print ("Accuracy: %s" % classifier.score(X_test, y_test))

    /home/manisha/anaconda3/lib/python3.5/site-packages/sklearn/model_selection/_split.py in train_test_split(*arrays, **options)
       1687         test_size = 0.25
       1688 
    -> 1689     arrays = indexable(*arrays)
       1690 
       1691     if stratify is not None:

    /home/manisha/anaconda3/lib/python3.5/site-packages/sklearn/utils/validation.py in indexable(*iterables)
        204         else:
        205             result.append(np.array(X))
    --> 206     check_consistent_length(*result)
        207     return result
        208

    /home/manisha/anaconda3/lib/python3.5/site-packages/sklearn/utils/validation.py in check_consistent_length(*arrays)
        175     """
        176 
    --> 177     lengths = [_num_samples(X) for X in arrays if X is not None]
        178     uniques = np.unique(lengths)
        179     if len(uniques) > 1:

    /home/manisha/anaconda3/lib/python3.5/site-packages/sklearn/utils/validation.py in <listcomp>(.0)
        175     """
        176 
    --> 177     lengths = [_num_samples(X) for X in arrays if X is not None]
        178     uniques = np.unique(lengths)
        179     if len(uniques) > 1:

    /home/manisha/anaconda3/lib/python3.5/site-packages/sklearn/utils/validation.py in _num_samples(x)
        124         if len(x.shape) == 0:
        125             raise TypeError("Singleton array %r cannot be considered"
    --> 126                             " a valid collection." % x)
        127         return x.shape[0]
        128     else:

    TypeError: Singleton array array(<function train at 0x7f3a311320d0>, dtype=object) cannot be considered a valid collection.

 ____



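Answer:

This error occurs because your function train shadows your variable train, so the function ends up being passed to itself.

Explanation

You define the variable train like this: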

train = df1.iloc[:,[4,6]]

Then, a few lines later, you define a function that is also named train, like this:

def train(classifier, X, y):

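So what actually happens is that the earlier binding of train is replaced by the new one. The name train no longer refers to the DataFrame you intended, but to the function you defined. The error message shows this: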

array(<function train at 0x7f3a311320d0>, dtype=object)

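Note the "function train" inside the error message: the object that scikit-learn received as X is the function, not your data.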
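The rebinding can be reproduced without pandas or scikit-learn. The sketch below is only an illustration: the list is a hypothetical stand-in for the DataFrame slice, and only the name train comes from the question.

```python
# Hypothetical stand-in for the DataFrame slice assigned in the question.
train = ["first document", "second document"]
print(type(train).__name__)   # list

# A later `def` with the same name rebinds `train` to the function object.
def train(classifier, X, y):
    return classifier

print(type(train).__name__)   # function
print(callable(train))        # True
```

After the def statement, any call such as train(trial1, train, target) passes the function itself as X, which is what the traceback reports.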

Solution

Rename one of them (the variable or the function). Suggestion: rename the function to something else, such as training or training_func.
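A minimal sketch of the renamed version might look like the following. The names features and train_model are illustrative choices, and the toy lists are hypothetical stand-ins for the columns of df1, which the question does not show. The sketch also assumes the text input is a single sequence of strings, since TfidfVectorizer expects an iterable of documents.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy data standing in for the text and label columns of df1.
features = [
    "win money now", "meeting at noon", "cheap money offer", "lunch at noon",
    "win cheap prize", "project meeting today", "money prize offer", "lunch today",
]
target = ["spam", "ham", "spam", "ham", "spam", "ham", "spam", "ham"]

# The function has its own name, so it no longer shadows the data variable.
def train_model(classifier, X, y):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=33)
    classifier.fit(X_train, y_train)
    print("Accuracy: %s" % classifier.score(X_test, y_test))
    return classifier

trial1 = Pipeline([
    ("vectorizer", TfidfVectorizer()),
    ("classifier", MultinomialNB()),
])

fitted = train_model(trial1, features, target)
```

With distinct names, the data and the function can no longer overwrite each other, and train_test_split receives an indexable collection as X.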