
list - mapMaybe/catMaybes work differently on Windows and Ubuntu

Reposted. Author: 行者123. Updated: 2023-12-04 18:35:29

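I have a Haskell program that reads the contents of an input file and parses them in order to sort the records and remove duplicates. The program has been dormant for a while, and I need to resurrect it. I mention this only as historical context for the problem.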

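When I re-enabled the program, I found that it no longer worked correctly. My debugging has isolated the problem to the code that parses and "cleans" the input file. What happens after that does not matter for this question, because I end up with an empty list of candidate records from the input file.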

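I write and test this program on my Windows laptop, then deploy and build the source on the Ubuntu server where it needs to run. As part of debugging, I broke the text parsing into several discrete steps. The point where I run catMaybes on the output of the last step is where I get an empty list, but only when I run it on the Ubuntu server.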

Here is the main function that demonstrates the problem:

    main = do
        [ inFileName ] <- getArgs
        sFile <- readFile inFileName
        let lrec = lines sFile
        putStrLn $ "Number of lines read from the file: " ++ show (length lrec)
        let prec = map processLine lrec
        putStrLn $ "Number of processed lines is " ++ show (length prec)
        -- let persons = mapMaybe processLine lrec
        let persons = catMaybes prec
        putStrLn $ "Number of filtered person records: " ++ show (length persons)
        let records = sortBy (compare `on` personEmployeeID) persons
        putStrLn $ "Number of records read and sorted is " ++ show (length records)

        {-
            Compare and warn about employees with duplicate records.
        -}
        let srec = groupBy ((==) `on` personEmployeeID) records
        putStrLn $ "Number of unique record groups is " ++ show (length srec)
        let dups = map (personEmployeeID . head) $ filter ((> 1) . length) srec
        putStrLn $ "Number of dups: " ++ show (length dups)
        unless (null dups) $ putStrLn $ "WARNING: Duplicate employees: " ++ show dups

        -- Remove the duplicates
        let cleanedRecords = map head srec
        putStrLn $ "Number of records in cleanedRecords is " ++ show (length cleanedRecords)

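As you may notice from the commented-out line, I also tried mapMaybe in place of catMaybes, and the result did not change. Here is the code for the processLine function, with a comment showing the format of the input records: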

    {-
        Splits a line of the input file into fields. The format includes 11 columns,
        separated by semicolons. The 10th column is required to be 'A' or 'S',
        indicating the user is active or short-term; otherwise we ignore that line.

        Sample Line:
        ------------------------------------------------------------------------------------------------------------------------------------------------
        99XXXXX17;MXXX ;TXXXXX ;MIXXXXXX ;RAA CBP;RAA;19910929;19910929;19910929;A; ;
        ------------------------------------------------------------------------------------------------------------------------------------------------
        emp id ;first name ;middle name ;last name ;loc code;dpt;hiredate;servdate;statdate;s;note ;
        ------------------------------------------------------------------------------------------------------------------------------------------------

        * s = status
    -}
    processLine :: String -> Maybe Person
    processLine line =
        let (_ :: String, _ :: String, _ :: String, result) =
                line =~ "^(.+);(.+);(.+);(.+);(.+);(.+);(.+);(.+);(.+);(A|S);(.+);$"
        in case result of
            [empid, fname, mname, lname, lcode, dept, hdate, srvdate, stdate, status, note]
                -> Just $ Person empid (trim fname) (trim mname) (trim lname)
                       (trim lcode) dept hdate srvdate stdate (readStatus status) (trim note)
            _   -> Nothing
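
As a side note on the commented-out line in main: `mapMaybe f` behaves like `catMaybes . map f`, so swapping one for the other is not expected to change the result. The following is a minimal sketch of that equivalence. `keepNonEmpty` is a hypothetical stand-in for processLine, not part of the original program.

```haskell
import Data.Maybe (catMaybes, mapMaybe)

-- Hypothetical stand-in for processLine: accepts only non-empty strings.
keepNonEmpty :: String -> Maybe String
keepNonEmpty s = if null s then Nothing else Just s

-- Two-step version: map to Maybe values, then drop the Nothings.
viaCatMaybes :: [String] -> [String]
viaCatMaybes = catMaybes . map keepNonEmpty

-- One-step version: mapMaybe fuses the map and the filtering.
viaMapMaybe :: [String] -> [String]
viaMapMaybe = mapMaybe keepNonEmpty
```

Both functions return the same list for any input, so an empty result from one implies an empty result from the other. The cause has to be that every line maps to Nothing.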

When I run this code on my Windows laptop, it produces the following output:

    Number of lines read from the file: 47793
    Number of processed lines is 47793
    Number of filtered person records: 32993
    Number of records read and sorted is 32993
    Number of unique record groups is 32949
    Number of dups: 44
    WARNING: Duplicate employees: [ {List removed for privacy } ]
    Number of records in cleanedRecords is 32949
    C:>cabal --version
    cabal-install version 1.22.4.0
    using version 1.22.3.0 of the Cabal library
    C:>ghc --version
    The Glorious Glasgow Haskell Compilation System, version 7.8.3

When I run the same code against the same input file on either of two different Ubuntu servers, each with a different version of Ubuntu and Haskell, I get the following output:

    Number of lines read from the file: 47793
    Number of processed lines is 47793
    Number of filtered person records: 0
    Number of records read and sorted is 0
    Number of unique record groups is 0
    Number of dups: 0
    Number of records in cleanedRecords is 0
    xx:~/$ cabal --version
    cabal-install version 0.14.0
    using version 1.14.0 of the Cabal library
    xx:~/$ ghc --version
    The Glorious Glasgow Haskell Compilation System, version 7.4.1

...and from the other Ubuntu server:

    Number of lines read from the file: 47793
    Number of processed lines is 47793
    Number of filtered person records: 0
    Number of records read and sorted is 0
    Number of unique record groups is 0
    Number of dups: 0
    Number of records in cleanedRecords is 0
    yy:~/$ cabal --version
    cabal-install version 0.10.2
    using version 1.10.2.0 of the Cabal library
    yy:~/$ ghc --version
    The Glorious Glasgow Haskell Compilation System, version 7.6.1

As usual, I'm stumped. I'm ready to try anything.

Any ideas?

Dave

Best answer

And the answer is...

Windows vs. Unix line endings.
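
A likely explanation, sketched under the assumption that the input file was written with Windows (CRLF) line endings: as far as I know, GHC's text-mode file reading translates CRLF to LF by default on Windows but not on Linux, and `lines` splits on `'\n'` only. On Linux each line therefore keeps a trailing `'\r'`, so a pattern that requires the line to end in `;` can no longer match. The sample data and the names below are hypothetical, and `endsWithSemicolon` only stands in for the `;$` tail of the regex in processLine.

```haskell
-- Hypothetical sample data with CRLF line endings; the real file has 11 columns.
sampleCRLF :: String
sampleCRLF = "1;A; ;\r\n2;S; ;\r\n"

-- What `lines` yields when no newline translation happened while reading.
rawLines :: [String]
rawLines = lines sampleCRLF

-- Stand-in for the ";$" tail of the regex: the line must end with ';'.
endsWithSemicolon :: String -> Bool
endsWithSemicolon s = not (null s) && last s == ';'
```

Here `rawLines` is `["1;A; ;\r", "2;S; ;\r"]`, and `endsWithSemicolon` is False for both entries, which mirrors every line being rejected on Ubuntu.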

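I added code to print the first few lines of input and saw a \r at the end of each line. I ran the file through dos2unix, and now I get the same results on the Ubuntu systems.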
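
The same two steps could also be done inside the program instead of with dos2unix. This is only a sketch, not the code actually used: `debugLines`, `stripCR` and `linesAnyEOL` are hypothetical helper names.

```haskell
-- `show` escapes control characters, so a stray '\r' becomes visible
-- when the first n lines are printed for inspection.
debugLines :: Int -> [String] -> [String]
debugLines n = map show . take n

-- Drop a single trailing carriage return, if present.
stripCR :: String -> String
stripCR s
    | not (null s) && last s == '\r' = init s
    | otherwise                      = s

-- Like `lines`, but tolerant of both LF and CRLF input.
linesAnyEOL :: String -> [String]
linesAnyEOL = map stripCR . lines
```

In main, `mapM_ putStrLn (debugLines 3 lrec)` would reveal the `\r`, and `let lrec = linesAnyEOL sFile` would make the parser accept files with either line ending. Another option is to open the file with a handle and call `hSetNewlineMode` with `universalNewlineMode` from System.IO before reading.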

Thanks for pointing out that the input file was the source of the problem.

Regarding "list - mapMaybe/catMaybes work differently on Windows and Ubuntu", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/30851379/
