Python 2.7 - find and replace from a text file, using a dictionary, into a new text file

Question:

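I am new to programming and have been learning Python in my spare time for the past few months. I decided to try writing a small script that converts American spellings in a text file to English spellings.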

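I have been trying all sorts of things for the past 5 hours and eventually came up with something that gets me a little closer to my goal, but I am still a long way off!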

#imported dictionary contains 1800 english:american spelling key:value pairs. 
from english_american_dictionary import dict


def replace_all(text, dict):
    for english, american in dict.iteritems():
        text = text.replace(american, english)
    return text


my_text = open('test_file.txt', 'r')

for line in my_text:
    new_line = replace_all(line, dict)
    output = open('output_test_file.txt', 'a')
    print >> output, new_line

output.close()

I am sure there are better ways to go about this, but for this script, these are the problems I am having:

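In the output file, the lines are written on every other line, with a blank line in between, but the original test_file.txt does not have these blank lines. The contents of test_file.txt are shown at the bottom of this page.
Only the first instance of an American spelling in a line gets converted to English.
I did not really want to open the output file in append mode, but I could not figure out 'r' in this code structure.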

Any help for this eager newbie is appreciated!

The contents of test_file.txt are:

I am sample file.
I contain an english spelling: colour.
3 american spellings on 1 line: color, analyze, utilize.
1 american spelling on 1 line: familiarize.

Answer:

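The extra blank lines you are seeing appear because you are using print to write out a line that already ends with a newline character. Since print also writes its own newline, your output ends up double spaced. An easy fix is to use output.write(new_line) instead.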
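The doubling can be reproduced without touching any files. This is only a minimal sketch: an in-memory io.StringIO buffer stands in for the output file, and it uses the print() function form (Python 3, or 2.7 with from __future__ import print_function). The question's print >> output, new_line statement appends a newline in the same way.

```python
import io

# A line read from a file keeps its trailing newline.
line = "I am sample file.\n"

doubled = io.StringIO()
print(line, file=doubled)   # print appends its own newline after the text

single = io.StringIO()
single.write(line)          # write emits the text exactly as given

print(repr(doubled.getvalue()))  # two newlines at the end
print(repr(single.getvalue()))   # one newline at the end
```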

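As for the file mode, the problem is that you are opening the output file over and over again. You only need to open it once, at the start. It is usually a good idea to use with statements to handle opening files, since they take care of closing the files for you when you are done with them.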
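If you want to keep the line-by-line structure, something along these lines should work. Treat it as a sketch rather than a drop-in replacement: convert_file and sample_dict are names made up for the illustration, the two-entry dictionary stands in for your 1800-entry one, and the files are created in a temporary directory so the example is self-contained.

```python
import os
import tempfile

def replace_all(text, spelling_dict):
    # Same idea as the question's function: keys are English, values American.
    for english, american in spelling_dict.items():
        text = text.replace(american, english)
    return text

def convert_file(in_path, out_path, spelling_dict):
    # Both files are opened exactly once. Mode 'w' is enough because the
    # output file stays open for the whole loop instead of being reopened
    # for every line.
    with open(in_path, 'r') as in_file, open(out_path, 'w') as out_file:
        for line in in_file:
            out_file.write(replace_all(line, spelling_dict))

# Hypothetical stand-in for the imported dictionary.
sample_dict = {'colour': 'color', 'analyse': 'analyze'}

tmp_dir = tempfile.mkdtemp()
in_path = os.path.join(tmp_dir, 'test_file.txt')
out_path = os.path.join(tmp_dir, 'output_test_file.txt')
with open(in_path, 'w') as f:
    f.write('color, analyze.\nno change here.\n')

convert_file(in_path, out_path, sample_dict)

with open(out_path) as f:
    result = f.read()
print(result)
```

Both files are closed automatically when the with block ends, so no explicit close() call is needed.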

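I do not understand your other problem, where only some of the replacements happen. Is your dictionary missing the spellings for 'analyze' and 'utilize'?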
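To check that guess: str.replace already replaces every occurrence it finds, so a word that stays unchanged most likely has no entry in the dictionary. A small sketch, assuming two made-up dictionaries (partial and full) in place of your real one, whose actual contents I cannot see:

```python
def replace_all(text, spelling_dict):
    # Keys are English spellings, values are American spellings.
    for english, american in spelling_dict.items():
        text = text.replace(american, english)
    return text

line = '3 american spellings on 1 line: color, analyze, utilize.'

# Hypothetical dictionaries for illustration only.
partial = {'colour': 'color'}
full = {'colour': 'color', 'analyse': 'analyze', 'utilise': 'utilize'}

with_partial = replace_all(line, partial)  # only 'color' has an entry
with_full = replace_all(line, full)        # all three words have entries
repeated = 'color color'.replace('color', 'colour')  # every occurrence changes

print(with_partial)
print(with_full)
print(repeated)
```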

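One suggestion I would make is to not do the replacements line by line. You can read the whole file in at once with file.read() and then work on it as a single unit. This will probably be faster, since it does not need to loop over the items in your spelling dictionary as often (just once, rather than once per line):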

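# spelling_dict stands for the spelling dictionary imported in the question
# (named dict there); replace_all is the question's function.
with open('test_file.txt', 'r') as in_file:
    text = in_file.read()

with open('output_test_file.txt', 'w') as out_file:
    out_file.write(replace_all(text, spelling_dict))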

Edit:

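To make your code correctly handle words that contain other words (like "entire" containing "tire"), you will probably need to abandon the simple str.replace approach in favor of regular expressions.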
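The difference is easy to see on a single sentence. A minimal sketch, with a made-up sample string, comparing plain str.replace against re.sub with \b word-boundary anchors:

```python
import re

text = 'The entire tire.'

# Plain substring replacement also rewrites the 'tire' inside 'entire'.
naive = text.replace('tire', 'tyre')

# \b matches only at a word boundary, so 'entire' is left alone.
whole_word = re.sub(r'\btire\b', 'tyre', text)

print(naive)       # The entyre tyre.
print(whole_word)  # The entire tyre.
```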

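Here is a quickly thrown together solution that uses re.sub, given a dictionary of spelling changes from American to British English (that is, in the reverse order of your current dictionary):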

import re

#from english_american_dictionary import ame_to_bre_spellings
ame_to_bre_spellings = {'tire':'tyre', 'color':'colour', 'utilize':'utilise'}

def replacer_factory(spelling_dict):
    def replacer(match):
        word = match.group()
        return spelling_dict.get(word, word)
    return replacer

def ame_to_bre(text):
    pattern = r'\b\w+\b'  # this pattern matches whole words only
    replacer = replacer_factory(ame_to_bre_spellings)
    return re.sub(pattern, replacer, text)

def main():
    #with open('test_file.txt') as in_file:
    #    text = in_file.read()
    text = 'foo color, entire, utilize'

    #with open('output_test_file.txt', 'w') as out_file:
    #    out_file.write(ame_to_bre(text))
    print(ame_to_bre(text))

if __name__ == '__main__':
    main()

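One nice thing about this code structure is that you can easily convert British English spellings back to American ones if you pass a dictionary in the other order to the replacer_factory function.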
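A rough sketch of that round trip follows. The names convert and bre_to_ame_spellings are made up for the illustration, and inverting the dictionary like this assumes every American spelling maps to exactly one British spelling (otherwise entries would collide).

```python
import re

ame_to_bre_spellings = {'tire': 'tyre', 'color': 'colour', 'utilize': 'utilise'}

# Swap keys and values to get the British-to-American direction.
bre_to_ame_spellings = dict((bre, ame) for ame, bre in ame_to_bre_spellings.items())

def replacer_factory(spelling_dict):
    def replacer(match):
        word = match.group()
        # Fall back to the original word when it has no entry.
        return spelling_dict.get(word, word)
    return replacer

def convert(text, spelling_dict):
    # Hypothetical helper: same whole-word pattern, dictionary passed in.
    return re.sub(r'\b\w+\b', replacer_factory(spelling_dict), text)

british = convert('foo color, entire, utilize', ame_to_bre_spellings)
american = convert(british, bre_to_ame_spellings)

print(british)   # foo colour, entire, utilise
print(american)  # foo color, entire, utilize
```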
