在 Windows 下的 Haskell 中,让控制台 I/O 使用 Unicode 字符似乎相当困难。这是悲惨的故事:
SetConsoleCP
win32 命令。 (因此,您需要绑定(bind)到 Win32 库。)或者,如果您不想这样做,您可以期望程序的用户为您更改代码页(然后他们必须在之前调用 chcp
命令他们运行你的程序)。 WriteConsoleW
.在这里,您将 UTF16 直接发送到 windows,这会很好地呈现它:没有编码不匹配的危险,因为 windows 总是希望 UTF16 具有这些功能。 不幸的是,这些选项在 Haskell 中都不能很好地工作。首先,我知道没有使用选项 B 的库,所以这不是很容易。这留下了选项 A。如果您使用 Haskell 的 I/O 库(
putStrLn
等),这就是该库将要做的。在现代版本的 Haskell 中,它会仔细询问 windows 当前的代码页是什么,并以正确的编码输出你的字符串。这种方法有两个问题:chcp cp65001
在他们运行您的程序之前(您可能会发现强制您的用户这样做很令人反感)。或者你需要绑定(bind)到SetConsoleCP
并在您的程序中执行等效操作(然后使用 hSetEncoding
以便 Haskell 库将使用新编码发送输出),这意味着您需要包装 win32 库的相关部分以使它们对 Haskell 可见。 上面列出的错误仍未解决并列为低优先级;那里的基本结论是选项 A(在我上面的分类中)是行不通的,需要切换到选项 B 才能获得可靠的结果。目前尚不清楚解决此问题的时间表,因为它看起来像是一些相当大的工作。
问题是:同时,任何人都可以提出一种解决方法来允许在 Windows 下的 Haskell 中使用 Unicode 控制台 I/O。
另见此 python bug tracker database entry ,在 Python 3 中解决同样的问题(建议修复,但尚未被代码库接受),以及 this stackoverflow answer ,在 Python 中提供了解决此问题的方法(基于我分类中的“选项 B”)。
最佳答案
我想我会回答我自己的问题,并列出一个可能的答案,以下是我目前正在做的事情。很可能一个人可以做得更好,这就是我问这个问题的原因!但我认为向人们提供以下内容是有意义的。它基本上是从 Python 到 Haskell 的翻译 python workaround for the same issue .它使用问题中提到的“选项 B”。
基本思路是创建一个模块IOUtil.hs,内容如下,可以import
进入你的代码:
{-# LANGUAGE ForeignFunctionInterface #-}
{-# LANGUAGE CPP #-}
{-# LANGUAGE NoImplicitPrelude #-}
module IOUtil (
IOUtil.interact,
IOUtil.putChar, IOUtil.putStr, IOUtil.putStrLn, IOUtil.print,
IOUtil.getChar, IOUtil.getLine, IOUtil.getContents, IOUtil.readIO,
IOUtil.readLn,
ePutChar, ePutStr, ePutStrLn, ePrint,
trace, traceIO
) where
#ifdef mingw32_HOST_OS
import System.Win32.Types (BOOL, HANDLE, DWORD, LPDWORD, LPWSTR, LPCWSTR, LPVOID)
import Foreign.C.Types (CWchar)
import Foreign
import Prelude hiding (getContents, putStr, putStrLn) --(IO, Read, Show, String)
--import qualified System.IO
import qualified System.IO (getContents)
import System.IO hiding (getContents, putStr, putStrLn)
import Data.Char (ord)
{- <http://msdn.microsoft.com/en-us/library/ms683231(VS.85).aspx>
HANDLE WINAPI GetStdHandle(DWORD nStdHandle);
returns INVALID_HANDLE_VALUE, NULL, or a valid handle -}
foreign import stdcall unsafe "GetStdHandle" win32GetStdHandle :: DWORD -> IO (HANDLE)
std_OUTPUT_HANDLE = -11 :: DWORD -- all DWORD arithmetic is performed modulo 2^n
std_ERROR_HANDLE = -12 :: DWORD
{- <http://msdn.microsoft.com/en-us/library/aa364960(VS.85).aspx>
DWORD WINAPI GetFileType(HANDLE hFile); -}
foreign import stdcall unsafe "GetFileType" win32GetFileType :: HANDLE -> IO (DWORD)
_FILE_TYPE_CHAR = 0x0002 :: DWORD
_FILE_TYPE_REMOTE = 0x8000 :: DWORD
{- <http://msdn.microsoft.com/en-us/library/ms683167(VS.85).aspx>
BOOL WINAPI GetConsoleMode(HANDLE hConsole, LPDWORD lpMode); -}
foreign import stdcall unsafe "GetConsoleMode" win32GetConsoleMode :: HANDLE -> LPDWORD -> IO (BOOL)
_INVALID_HANDLE_VALUE = (intPtrToPtr $ -1) :: HANDLE
is_a_console :: HANDLE -> IO (Bool)
is_a_console handle
= if (handle == _INVALID_HANDLE_VALUE) then return False
else do ft <- win32GetFileType handle
if ((ft .&. complement _FILE_TYPE_REMOTE) /= _FILE_TYPE_CHAR) then return False
else do ptr <- malloc
cm <- win32GetConsoleMode handle ptr
free ptr
return cm
real_stdout :: IO (Bool)
real_stdout = is_a_console =<< win32GetStdHandle std_OUTPUT_HANDLE
real_stderr :: IO (Bool)
real_stderr = is_a_console =<< win32GetStdHandle std_ERROR_HANDLE
{- BOOL WINAPI WriteConsoleW(HANDLE hOutput, LPWSTR lpBuffer, DWORD nChars,
LPDWORD lpCharsWritten, LPVOID lpReserved); -}
foreign import stdcall unsafe "WriteConsoleW" win32WriteConsoleW
:: HANDLE -> LPWSTR -> DWORD -> LPDWORD -> LPVOID -> IO (BOOL)
data ConsoleInfo = ConsoleInfo Int (Ptr CWchar) (Ptr DWORD) HANDLE
writeConsole :: ConsoleInfo -> [Char] -> IO ()
writeConsole (ConsoleInfo bufsize buf written handle) string
= let fillbuf :: Int -> [Char] -> IO ()
fillbuf i [] = emptybuf buf i []
fillbuf i remain@(first:rest)
| i + 1 < bufsize && ordf <= 0xffff = do pokeElemOff buf i asWord
fillbuf (i+1) rest
| i + 1 < bufsize && ordf > 0xffff = do pokeElemOff buf i word1
pokeElemOff buf (i+1) word2
fillbuf (i+2) rest
| otherwise = emptybuf buf i remain
where ordf = ord first
asWord = fromInteger (toInteger ordf) :: CWchar
sub = ordf - 0x10000
word1' = ((shiftR sub 10) .&. 0x3ff) + 0xD800
word2' = (sub .&. 0x3FF) + 0xDC00
word1 = fromInteger . toInteger $ word1'
word2 = fromInteger . toInteger $ word2'
emptybuf :: (Ptr CWchar) -> Int -> [Char] -> IO ()
emptybuf _ 0 [] = return ()
emptybuf _ 0 remain = fillbuf 0 remain
emptybuf ptr nLeft remain
= do let nLeft' = fromInteger . toInteger $ nLeft
ret <- win32WriteConsoleW handle ptr nLeft' written nullPtr
nWritten <- peek written
let nWritten' = fromInteger . toInteger $ nWritten
if ret && (nWritten > 0)
then emptybuf (ptr `plusPtr` (nWritten' * szWChar)) (nLeft - nWritten') remain
else fail "WriteConsoleW failed.\n"
in fillbuf 0 string
szWChar = sizeOf (0 :: CWchar)
makeConsoleInfo :: DWORD -> Handle -> IO (Either ConsoleInfo Handle)
makeConsoleInfo nStdHandle fallback
= do handle <- win32GetStdHandle nStdHandle
is_console <- is_a_console handle
let bufsize = 10000
if not is_console then return $ Right fallback
else do buf <- mallocBytes (szWChar * bufsize)
written <- malloc
return . Left $ ConsoleInfo bufsize buf written handle
{-# NOINLINE stdoutConsoleInfo #-}
stdoutConsoleInfo :: Either ConsoleInfo Handle
stdoutConsoleInfo = unsafePerformIO $ makeConsoleInfo std_OUTPUT_HANDLE stdout
{-# NOINLINE stderrConsoleInfo #-}
stderrConsoleInfo :: Either ConsoleInfo Handle
stderrConsoleInfo = unsafePerformIO $ makeConsoleInfo std_ERROR_HANDLE stderr
interact :: (String -> String) -> IO ()
interact f = do s <- getContents
putStr (f s)
conPutChar ci = writeConsole ci . replicate 1
conPutStr = writeConsole
conPutStrLn ci = writeConsole ci . ( ++ "\n")
putChar :: Char -> IO ()
putChar = (either conPutChar hPutChar ) stdoutConsoleInfo
putStr :: String -> IO ()
putStr = (either conPutStr hPutStr ) stdoutConsoleInfo
putStrLn :: String -> IO ()
putStrLn = (either conPutStrLn hPutStrLn) stdoutConsoleInfo
print :: Show a => a -> IO ()
print = putStrLn . show
getChar = System.IO.getChar
getLine = System.IO.getLine
getContents = System.IO.getContents
readIO :: Read a => String -> IO a
readIO = System.IO.readIO
readLn :: Read a => IO a
readLn = System.IO.readLn
ePutChar :: Char -> IO ()
ePutChar = (either conPutChar hPutChar ) stderrConsoleInfo
ePutStr :: String -> IO ()
ePutStr = (either conPutStr hPutStr ) stderrConsoleInfo
ePutStrLn :: String -> IO ()
ePutStrLn = (either conPutStrLn hPutStrLn) stderrConsoleInfo
ePrint :: Show a => a -> IO ()
ePrint = ePutStrLn . show
#else
import qualified System.IO
import Prelude (IO, Read, Show, String)
interact = System.IO.interact
putChar = System.IO.putChar
putStr = System.IO.putStr
putStrLn = System.IO.putStrLn
getChar = System.IO.getChar
getLine = System.IO.getLine
getContents = System.IO.getContents
ePutChar = System.IO.hPutChar System.IO.stderr
ePutStr = System.IO.hPutStr System.IO.stderr
ePutStrLn = System.IO.hPutStrLn System.IO.stderr
print :: Show a => a -> IO ()
print = System.IO.print
readIO :: Read a => String -> IO a
readIO = System.IO.readIO
readLn :: Read a => IO a
readLn = System.IO.readLn
ePrint :: Show a => a -> IO ()
ePrint = System.IO.hPrint System.IO.stderr
#endif
trace :: String -> a -> a
trace string expr = unsafePerformIO $ do
traceIO string
return expr
traceIO :: String -> IO ()
traceIO = ePutStrLn
然后,您可以使用其中包含的 I/O 函数而不是标准库函数。他们会检测输出是否被重定向;如果不是(即如果我们正在写入“真实”控制台),那么我们将绕过通常的 Haskell I/O 函数并使用
WriteConsoleW
直接写入 win32 控制台,unicode-aware win32 控制台功能。在非 windows 平台上,条件编译意味着这里的函数只是调用标准库的函数。如果你需要打印到 stderr,你应该使用(例如)
ePutStrLn
,不是 hPutStrLn stderr
;我们没有定义 hPutStrLn
. (定义一个是读者的练习!)
关于windows - Windows 上 Haskell 中的 Unicode 控制台 I/O,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/10779149/