Beginning Python Visualization: Crafting Visual Transformation Scripts

Page 148

C H A P T E R 4 N D A T A O R G A N I Z A T I O N

Method 1: We use the file name as the unique key in our dictionary, iu`e_p-. The value is a list of Wbehal]pd( behaoevaY. At first, iu`e_p- is empty. For every entry, we ask whether the file name is a key to the dictionary. If it wasn’t encountered, we add the list Wbehal]pd( behaoevaY as a value to the key, file name. If the key is in the dictionary, it means that this file name has been encountered in the past. We then retrieve the file size and compare it with the current entry file size. Listing 4-9 shows the implementation. Listing 4-9. Looking for Duplicate Files, Method 1 `ab bej`[`qlao[-$pdabehao%6 Oa]n_dao bkn beha `qlhe_]pao( iapdk` -* naoqhp- 9 WY iu`e_p- 9 `e_p$% bkn behaj]ia( l]pdj]ia( behaoeva ej pdabehao6 eb behaj]ia ej iu`e_p-6 W`ql[beha( `ql[oevaY 9 iu`e_p-Wbehaj]iaY eb `ql[oeva 99 behaoeva6 naoqhp-*]llaj`$l]pdj]ia% ahoa6 iu`e_p-Wbehaj]iaY 9 Wl]pdj]ia( behaoevaY napqnj naoqhpOne of the obvious shortcomings of this method is that there might be several files with the same file name but different sizes; the algorithm might not catch some of them. For example, if the first file is of size A, and several other files have the same file name but are of size B, the algorithm will not identify files of size B as duplicates. Method 2: This method uses the path name as the unique key in the dictionary iu`e_p. and the list Wbehaj]ia( behaoevaY as the value. Since we’re using the path name as the key, it’s guaranteed to be unique; there are no two files with the same file name and path name. To check whether a file name already exists in the dictionary, we iterate through all the elements in the dictionary using the epanepaio$% method. If the file name and the file size are identical, we announce them to be a duplicate. If not, we add the associated path name as key and the Wbehaj]ia( behaoevaY as a new value to the dictionary (see Listing 4-10). Listing 4-10. Looking for Duplicate Files, Method 2 `ab bej`[`qlao[.$pdabehao%6 Oa]n_dao bkn beha `qlhe_]pao( iapdk` .* naoqhp. 9 WY iu`e_p. 9 `e_p$% bkn behaj]ia( l]pdj]ia( behaoeva ej pdabehao6 bkn g( r ej iu`e_p.*epanepaio$%6 eb rW,Y 99 behaj]ia ]j` rW-Y 99 behaoeva6 naoqhp.*]llaj`$l]pdj]ia% ahoa6 iu`e_p.Wl]pdj]iaY 9 Wbehaj]ia( behaoevaY napqnj naoqhp.

129


Issuu converts static files into: digital portfolios, online yearbooks, online catalogs, digital photo albums and more. Sign up and create your flipbook.