Creating a list of dictionaries results in a list of copies of the same dictionary

Question:

I want to get the information for every iframe on a web page.

Code:

import urllib2
from pprint import pprint
from bs4 import BeautifulSoup

site = "http://" + url
f = urllib2.urlopen(site)
web_content = f.read()

soup = BeautifulSoup(web_content)
info = {}
content = []
for iframe in soup.find_all('iframe'):
    info['src'] = iframe.get('src')
    info['height'] = iframe.get('height')
    info['width'] = iframe.get('width')
    content.append(info)
    print(info)

pprint(content)

Output of print(info):

{'src': u'abc.com', 'width': u'0', 'height': u'0'}
{'src': u'xyz.com', 'width': u'0', 'height': u'0'}
{'src': u'http://www.detik.com', 'width': u'1000', 'height': u'600'}

Output of pprint(content):

[{'height': u'600', 'src': u'http://www.detik.com', 'width': u'1000'},
 {'height': u'600', 'src': u'http://www.detik.com', 'width': u'1000'},
 {'height': u'600', 'src': u'http://www.detik.com', 'width': u'1000'}]

Why are the values in content wrong? I expected them to match what print(info) shows on each iteration.

Answer:

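You are not creating a separate dictionary for each iframe. You keep modifying the same dictionary, and you keep appending another reference to that one dictionary to the list. After the loop, every entry in the list points to the same object, which holds the values from the last iframe.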

Remember that when you do something like content.append(info), you are not copying the data; you are only appending a reference to it.
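The behavior can be reproduced without BeautifulSoup. The following is a minimal sketch, written for Python 3; the names shared, items, and copies are made up for illustration and do not appear in the question.

```python
# One dictionary object, mutated and appended on every iteration.
shared = {}
items = []

for value in ("first", "second"):
    shared["src"] = value   # overwrites the key in the same dict
    items.append(shared)    # appends another reference, not a copy

# Both list entries are the very same object, so both show the last value.
same_object = items[0] is items[1]
print(same_object)
print(items)

# Appending a shallow copy instead keeps each iteration's values.
copies = []
for value in ("first", "second"):
    shared["src"] = value
    copies.append(dict(shared))
print(copies)
```

Copying with dict(shared) is one way out, but creating a fresh dictionary per iteration is usually simpler to read.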

You need to create a new dictionary for each iframe.

for iframe in soup.find_all('iframe'):
    info = {}
    ...

Even better, you don't need to create an empty dictionary first. Just build the whole thing in one go:

for iframe in soup.find_all('iframe'):
    info = {
        "src":    iframe.get('src'),
        "height": iframe.get('height'),
        "width":  iframe.get('width'),
    }
    content.append(info)

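There are other ways to do this, such as iterating over a list of attribute names or using list or dictionary comprehensions, but it is hard to improve on the clarity of the code above.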
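For comparison, the comprehension-based alternative might look like the sketch below. To keep it runnable without a network connection or BeautifulSoup, plain dicts stand in for the iframe tags, since they support the same .get(name) call; ATTRS and fake_iframes are hypothetical names introduced here, and the code is written for Python 3.

```python
# Attribute names to collect from each iframe (illustrative constant).
ATTRS = ("src", "height", "width")

# Stand-ins for the tags returned by soup.find_all('iframe').
fake_iframes = [
    {"src": "abc.com", "height": "0", "width": "0"},
    {"src": "http://www.detik.com", "height": "600", "width": "1000"},
]

# The inner dict comprehension builds a new dictionary for every iframe,
# so no two list entries share the same object.
content = [
    {attr: iframe.get(attr) for attr in ATTRS}
    for iframe in fake_iframes
]
print(content)
```

With real data, fake_iframes would be replaced by soup.find_all('iframe'). Whether this reads better than the explicit loop is a matter of taste.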

