Redirecting the output of GridSearchCV (or any other sklearn object) to a file

Problem description

I want to be able to save the GridSearchCV output to a file while it is running:

GridSearchCV(XGBClassifier(), tuned_parameters, cv=cv, n_jobs=-1, verbose=10)

Here is a sample of the output:

    Fitting 1 folds for each of 200 candidates,totalling 200 fits
    [Parallel(n_jobs=-1)]: Using backend with 4 concurrent workers.
    [CV] colsample_bytree=0.7,learning_rate=0.05,max_depth=4,n_estimators=300,subsample=0.7  
    [CV] colsample_bytree=0.7,subsample=0.7,score=0.645,total= 6.3min
    [Parallel(n_jobs=-1)]: Done   1 tasks      | elapsed:  6.3min

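I managed to save the first line and the "Parallel" lines, but no matter what I do, I cannot save the lines that start with [CV]. I want to save these lines so that if the program fails, I can at least see partial results.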

I tried the solution from here:

sys.stdout = open('file','w')

and:

from contextlib import redirect_stdout

with open('help.txt', 'w') as f:
    with redirect_stdout(f):
        print('it now prints to `help.txt`')

This solution (which also points to this solution) did not work either:

class Tee(object):
    def __init__(self,*files):
        self.files = files
    def write(self,obj):
        for f in self.files:
            f.write(obj)
            f.flush() # If you want the output to be visible immediately
    def flush(self):
        for f in self.files:
            f.flush()

I also tried the "monkey-patch" approach, as its author calls it, but that too saved only the "Parallel" lines.

(To emphasize: the code above is only a glimpse of the suggested solutions. When I tried them, I used all of the relevant code.)

Is it possible to save ALL of the output?

Solution

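I do not know whether you can do this with the sys library or some other library. Instead, I suggest the following approach, which redirects both stdout and stderr properly.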

Suppose you have a script like this:

test.py

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
params = {"C": [0.001,0.01,0.1,1,2,3]}
grid = GridSearchCV(model,params,n_jobs=-1,verbose=10)
X = np.random.randn(100,10)
y = np.random.randint(0, 2, 100)  # 100 binary labels, one per row of X

grid.fit(X,y)

Then run:

python test.py > logfile.txt 2>&1

logfile.txt will then contain both the "Parallel" lines and the "[CV]" lines:

Fitting 5 folds for each of 6 candidates,totalling 30 fits
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 12 concurrent workers.
[Parallel(n_jobs=-1)]: Done   1 tasks      | elapsed:    1.6s
[Parallel(n_jobs=-1)]: Done  11 out of  30 | elapsed:    1.7s remaining:    2.9s
[Parallel(n_jobs=-1)]: Done  15 out of  30 | elapsed:    1.7s remaining:    1.7s
[Parallel(n_jobs=-1)]: Done  19 out of  30 | elapsed:    1.7s remaining:    1.0s
[Parallel(n_jobs=-1)]: Done  23 out of  30 | elapsed:    1.7s remaining:    0.5s
[Parallel(n_jobs=-1)]: Done  27 out of  30 | elapsed:    1.7s remaining:    0.2s
[Parallel(n_jobs=-1)]: Done  30 out of  30 | elapsed:    1.7s finished
[CV] C=0.001 .........................................................
[CV] ............................. C=0.001,score=0.500,total=   0.0s
[CV] C=0.1 ...........................................................
[CV] ............................... C=0.1,score=0.450,score=0.550,total=   0.0s
[CV] C=1 .............................................................
[CV] ................................. C=1,total=   0.0s
[CV] C=2 .............................................................
...
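
If you would rather start the script from Python than from a shell, roughly the same redirection can be sketched with the standard subprocess module. This is a minimal sketch, not part of the original answer: the helper name run_with_log is hypothetical, and it assumes the script can be launched with the current interpreter.

```python
import subprocess
import sys


def run_with_log(cmd, log_path):
    """Run cmd, sending its stdout and stderr to log_path; return the exit code.

    Hypothetical helper, shown only to illustrate the redirection.
    """
    with open(log_path, "w") as log:
        # stderr=subprocess.STDOUT plays the role of 2>&1 in the shell command:
        # both streams of the child process end up in the same file.
        completed = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return completed.returncode


# Example call (assumes test.py is in the current working directory):
# run_with_log([sys.executable, "test.py"], "logfile.txt")
```

When stdout goes to a file it is block-buffered by default, so lines may reach the log late or interleave oddly with stderr. Passing the interpreter's -u flag, as in [sys.executable, "-u", "test.py"], should make partial results appear sooner.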

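Details

The "[CV]" lines are produced by print statements (Source), which write to stdout.

The "Parallel" lines are produced by a logger (Source), which writes to stderr.

> logfile.txt 2>&1 is a shell idiom that redirects both stdout and stderr to the same file (Related question). As a result, both kinds of message end up in the same file. Because the redirection happens at the shell level, it also covers output written by the worker processes that n_jobs=-1 starts, which reassigning sys.stdout inside the parent Python process does not affect.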
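
If the redirection has to happen inside the script itself, one option is to redirect the underlying file descriptors rather than the sys.stdout object, since child processes inherit descriptors 1 and 2 but not a reassigned Python object. The following is a minimal sketch under that assumption, not something from the original answer: the name redirect_fds_to_file is hypothetical, the approach is aimed at POSIX systems, and it has not been verified against GridSearchCV specifically. Worker processes that were already running before the block was entered would keep their old descriptors.

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def redirect_fds_to_file(path):
    """Point file descriptors 1 (stdout) and 2 (stderr) at path inside the block.

    Hypothetical helper: it works at the OS level, so processes started
    while the block is active inherit the redirected descriptors.
    """
    # Flush anything Python has buffered before switching descriptors.
    sys.stdout.flush()
    sys.stderr.flush()
    saved_out = os.dup(1)  # keep copies so the originals can be restored
    saved_err = os.dup(2)
    with open(path, "w") as log:
        os.dup2(log.fileno(), 1)
        os.dup2(log.fileno(), 2)
        try:
            yield
        finally:
            sys.stdout.flush()
            sys.stderr.flush()
            os.dup2(saved_out, 1)  # restore the original stdout and stderr
            os.dup2(saved_err, 2)
            os.close(saved_out)
            os.close(saved_err)


# Example use, with grid, X and y defined as in test.py:
# with redirect_fds_to_file("logfile.txt"):
#     grid.fit(X, y)
```

The shell redirection shown earlier remains the simpler choice; this variant is only worth considering when the command line cannot be changed.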