Problem description
def line_count(filename):
    for filename in os.walk(os.path.abspath('my directory filename')):
        lines = 0
        with open(filename) as file:
            lines = len([line for line in file.readlines() if line.strip() != ''])
            print lines

def find_big_files(files):
    file_sizes = [(line_count(file), file) for file in files]
    print sorted(file_sizes, key = lambda file_size: file_size[0], reverse = True)

sorted_files = find_big_files(files)
This does not work.
Answer 1
Since you are looking for the LONGEST file rather than the BIGGEST file, do the following:
def get_length(file):
    len_ = 0
    with open(file, 'r') as f:
        for line in f:
            len_ += 1
    return len_

files = [file for file in however_you_build_your_list]
# reverse=True sorts in descending order of line count
files = sorted(files, key=get_length, reverse=True)
# files[0] is now the longest
# files[-1] is now the shortest
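The name however_you_build_your_list above is a placeholder. One possible way to build the list, sketched here on the assumption that you want every regular file under a directory tree, is os.walk. The helper name list_files is made up for illustration, and the sketch is written for Python 3. Note that os.walk yields (dirpath, dirnames, filenames) tuples rather than file paths, so each name has to be joined to its directory before it can be opened.

```python
import os

def list_files(root):
    """Return the paths of all files under root (hypothetical helper)."""
    paths = []
    # os.walk yields one (dirpath, dirnames, filenames) tuple per directory
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            # Join the bare file name to its directory to get a usable path
            paths.append(os.path.join(dirpath, name))
    return paths
```

The result can then be passed as the files list, for example files = list_files('some/directory').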
Answer 2
Do you count blank lines as lines?
If so, the following gives you the raw number of lines in the file:
def line_count(filename):
    lines = 0
    with open(filename) as file:
        lines = len(file.readlines())
    return lines
If not, change the lines = ... line to:
        lines = len([line for line in file.readlines() if line.strip() != ''])
The rest of the code would then look like this:
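def find_big_files(files):
    largest = (0, None)
    second_largest = (0, None)
    for file in files:
        size = line_count(file)
        if size > largest[0]:
            second_largest = largest
            largest = (size, file)
        elif size > second_largest[0]:
            # Falls between the current top two: replace only the runner-up
            second_largest = (size, file)
    return largest, second_largest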
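Note that this is really inefficient, because it has to open every file and iterate over its contents. It is therefore O(files * count(file)), that is, proportional to the total number of lines across all files. But if you really care about line counts, there is no better way, at least for generic .txt files or similar.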
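Every line still has to be read, but the memory cost can at least be kept flat. The following is a minimal sketch, not part of the original answer, that streams the file instead of building a list with readlines(). The function name count_lines_streaming and the skip_blank parameter are assumptions introduced for illustration, and the code is written for Python 3.

```python
def count_lines_streaming(filename, skip_blank=True):
    """Count lines one at a time without loading the whole file."""
    with open(filename) as f:
        if skip_blank:
            # A line made only of whitespace strips to '' and is skipped
            return sum(1 for line in f if line.strip())
        # Count every line, including blank ones
        return sum(1 for _ in f)
```

The running time is still proportional to the file length; only the peak memory use changes.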
If you want the whole list sorted by line count:
def find_big_files(files):
    file_sizes = [(line_count(file), file) for file in files]
    return sorted(file_sizes, key = lambda file_size: file_size[0])
This returns a list of (line_count, file_name) tuples in ascending order of line count, so list[-1] is the largest, list[-2] is the second largest, and so on.
Edit:
The OP asked me to post the whole code in a single block that solves the problem, so here it is:
def line_count(filename):
    lines = 0
    with open(filename) as file:
        lines = len([line for line in file.readlines() if line.strip() != ''])
    return lines

def find_big_files(files):
    file_sizes = [(line_count(file), file) for file in files]
    return sorted(file_sizes, key = lambda file_size: file_size[0], reverse = True)
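The value returned by result = find_big_files(files) is a list [(count, filename), ...] ordered from largest to smallest, so result[0] is the largest, result[1] is the second largest, and so on. Files with equal counts appear in the order in which they occur in the input list of file paths.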
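If only the top few entries are needed, as in the two-largest version earlier in this answer, sorting the full list is more work than necessary. The sketch below is an alternative that was not in the original answer: it uses heapq.nlargest from the standard library. The names count_nonblank and longest_files are hypothetical, and the code assumes Python 3.

```python
import heapq

def count_nonblank(filename):
    """Count the non-blank lines of one file."""
    with open(filename) as f:
        return sum(1 for line in f if line.strip())

def longest_files(files, n=2):
    """Return up to n (count, filename) tuples, longest first."""
    # Generator: each file is counted once, as nlargest consumes it
    counted = ((count_nonblank(name), name) for name in files)
    return heapq.nlargest(n, counted, key=lambda pair: pair[0])
```

Each file is still read in full, so this only saves the cost of sorting, which matters mainly when the list of files is long.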