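I am working with SSIS and have complex, unstructured text files. I need to build an SSIS package that parses such a file and loads the data for the required columns into a database. What is the best way to parse the text file, and how should I go about it? Can I write a script that reads the file line by line? I am also unsure whether each line of a text file can be read without writing a script.
The required columns from the text file data are DEVICEID, DATAVALUE and DATAUNITS:
This is the text file: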
12/02/2015 09:47:44:745 SecureHARTPort version: 1.1.12.0.
12/02/2015 09:47:44:745 Connecting and initialing Session to
67.40.65.181 Port:5094 Tcp
12/02/2015 09:47:44:745 Tx: Message Header: Ver: 1, MsgType: 0, MsgId: 0
Status: 0x00
TranId: 1, Data ByteCount: 5
Data: 01 00 09 27 C0
12/02/2015 09:47:44:761 Rx: Message Header: Ver: 1, MsgType: 1, MsgId: 0
Status: 0x00
TranId: 1, Data ByteCount: 5
Data: 01 00 09 27 C0
12/02/2015 09:47:44:855 Tx: Message Header: Ver: 1, MsgType: 0, MsgId: 3
Status: 0x00
TranId: 2, Data ByteCount: 5
Data: 02 80 00 00 82
12/02/2015 09:47:44:855 Rx: Message Header: Ver: 1, MsgType: 1, MsgId: 3
Status: 0x00
TranId: 2, Data ByteCount: 29
Data: 06 80 00 18 00 50 FE 26 4E 05 07 05 02 0E 0C 0B 6A 64 05 04 00 01 50
00 26 00 26 84 8E
Rx Cmd=0, Rsp code=0x00, Device Status=0x50
Expansion Code=254
Expanded Device Type=9806
# Request Preambles=5
Universal Comand Revision Level=7
Transmitter HART Revision Level=5
Software Revision=2
Hardware Revision Level / Physical Signaling Code=14
Flags=0C
Device ID=748132
Minimum # Response Preambles=5
Max # of device variables=4
Configuration Change Counter=1
Extended Field Device Status=50
Manufacturer's ID=38
Private Label Distributor=38
Device Profile=132
12/02/2015 09:47:44:855 Tx: Message Header: Ver: 1, MsgType: 0, MsgId: 3
Status: 0x00
TranId: 3, Data ByteCount: 9
Data: 82 A6 4E 0B 6A 64 14 00 7B
12/02/2015 09:47:44:870 Rx: Message Header: Ver: 1, MsgType: 1, MsgId: 3
Status: 0x00
TranId: 3, Data ByteCount: 43
Data: 86 A6 4E 0B 6A 64 14 22 00 50 77 69 68 61 72 74 67 77 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0C
Rx Cmd=20, Rsp code=0x00, Device Status=0x50
Long Tag=wihartgw
12/02/2015 09:47:44:870 Tx: Message Header: Ver: 1, MsgType: 0, MsgId: 3
Status: 0x00
TranId: 4, Data ByteCount: 9
Data: 82 A6 4E 0B 6A 64 4A 00 25
12/02/2015 09:47:44:886 Rx: Message Header: Ver: 1, MsgType: 1, MsgId: 3
Status: 0x00
TranId: 4, Data ByteCount: 19
Data: 86 A6 4E 0B 6A 64 4A 0A 00 50 01 01 65 00 05 02 01 03 1B
Rx Cmd=74, Rsp code=0x00, Device Status=0x50
Max Num IO Cards=1
Max Num Channels per IO Card=1
Max Num Sub-Devices per Channel=101
Num Devices Detected=5
Max Num DR Supported=2
Master Mode for Comm=1
Retry Count for Sub-Device=3
Rx Cmd=9, Rsp code=0x00, Device Status=0x50
Extended Device Status=0
Slot0 Var Code=246
Slot0 Var Classification=0
Slot0 Var Units=251
Slot0 Var Value=4
Slot0 Var Status=C0
Slot1 Var Code=116
Slot1 Var Classification=209
Slot1 Var Units=70
Slot1 Var Value=0
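Best answer
I don't know whether this helps you, but with a T-SQL script like the one below you can first read the text line by line and then apply suitable filters: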
DECLARE @YourText NVARCHAR(MAX)=
N' 12/02/2015 09:47:44:745 SecureHARTPort version: 1.1.12.0.
12/02/2015 09:47:44:745 Connecting and initialing Session to
67.40.65.181 Port:5094 Tcp
12/02/2015 09:47:44:745 Tx: Message Header: Ver: 1, MsgType: 0, MsgId: 0
Status: 0x00
TranId: 1, Data ByteCount: 5
Data: 01 00 09 27 C0
12/02/2015 09:47:44:761 Rx: Message Header: Ver: 1, MsgType: 1, MsgId: 0
Status: 0x00
TranId: 1, Data ByteCount: 5
Data: 01 00 09 27 C0
12/02/2015 09:47:44:855 Tx: Message Header: Ver: 1, MsgType: 0, MsgId: 3
Status: 0x00
TranId: 2, Data ByteCount: 5
Data: 02 80 00 00 82
12/02/2015 09:47:44:855 Rx: Message Header: Ver: 1, MsgType: 1, MsgId: 3
Status: 0x00
TranId: 2, Data ByteCount: 29
Data: 06 80 00 18 00 50 FE 26 4E 05 07 05 02 0E 0C 0B 6A 64 05 04 00 01 50
00 26 00 26 84 8E
Rx Cmd=0, Rsp code=0x00, Device Status=0x50
Expansion Code=254
Expanded Device Type=9806
# Request Preambles=5
Universal Comand Revision Level=7
Transmitter HART Revision Level=5
Software Revision=2
Hardware Revision Level / Physical Signaling Code=14
Flags=0C
Device ID=748132
Minimum # Response Preambles=5
Max # of device variables=4
Configuration Change Counter=1
Extended Field Device Status=50
Manufacturer''s ID=38
Private Label Distributor=38
Device Profile=132
12/02/2015 09:47:44:855 Tx: Message Header: Ver: 1, MsgType: 0, MsgId: 3
Status: 0x00
TranId: 3, Data ByteCount: 9
Data: 82 A6 4E 0B 6A 64 14 00 7B
12/02/2015 09:47:44:870 Rx: Message Header: Ver: 1, MsgType: 1, MsgId: 3
Status: 0x00
TranId: 3, Data ByteCount: 43
Data: 86 A6 4E 0B 6A 64 14 22 00 50 77 69 68 61 72 74 67 77 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0C
Rx Cmd=20, Rsp code=0x00, Device Status=0x50
Long Tag=wihartgw
12/02/2015 09:47:44:870 Tx: Message Header: Ver: 1, MsgType: 0, MsgId: 3
Status: 0x00
TranId: 4, Data ByteCount: 9
Data: 82 A6 4E 0B 6A 64 4A 00 25
12/02/2015 09:47:44:886 Rx: Message Header: Ver: 1, MsgType: 1, MsgId: 3
Status: 0x00
TranId: 4, Data ByteCount: 19
Data: 86 A6 4E 0B 6A 64 4A 0A 00 50 01 01 65 00 05 02 01 03 1B
Rx Cmd=74, Rsp code=0x00, Device Status=0x50
Max Num IO Cards=1
Max Num Channels per IO Card=1
Max Num Sub-Devices per Channel=101
Num Devices Detected=5
Max Num DR Supported=2
Master Mode for Comm=1
Retry Count for Sub-Device=3
Rx Cmd=9, Rsp code=0x00, Device Status=0x50
Extended Device Status=0
Slot0 Var Code=246
Slot0 Var Classification=0
Slot0 Var Units=251
Slot0 Var Value=4
Slot0 Var Status=C0
Slot1 Var Code=116
Slot1 Var Classification=209
Slot1 Var Units=70
Slot1 Var Value=0';
-- The query splits the text into lines at any combination of CHAR(13) and/or CHAR(10):
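WITH LineByLine AS
(
    -- Number the lines. ORDER BY (SELECT NULL) does not formally guarantee the original order.
    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS LineNr
          ,LTRIM(RTRIM(x.value(N'(text())[1]',N'nvarchar(max)'))) AS Line
    FROM
    (
        -- Normalize every line break to CHAR(13), mark it with '\nl', let FOR XML escape
        -- special characters, then turn each marker into an <x> element boundary.
        SELECT CAST(N'<x>' + REPLACE((SELECT REPLACE(REPLACE(REPLACE(@YourText,NCHAR(10),NCHAR(13)),NCHAR(13)+NCHAR(13),NCHAR(13)),NCHAR(13),N'\nl') AS [*] FOR XML PATH('')),N'\nl',N'</x><x>') + N'</x>' AS XML) AS Casted
    ) AS t
    -- One row per non-empty <x> element, i.e. per non-empty line
    CROSS APPLY Casted.nodes(N'/x[text()]') AS A(x)
)
SELECT LineNr,Line
FROM LineByLine
WHERE CHARINDEX('Device ID=',Line)>0
   OR CHARINDEX('Data:',Line)>0
   OR CHARINDEX('unit',Line)>0; -- matches 'Units' only under a case-insensitive collation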
The result will be:
LineNr Line
7 Data: 01 00 09 27 C0
11 Data: 01 00 09 27 C0
15 Data: 02 80 00 00 82
19 Data: 06 80 00 18 00 50 FE 26 4E 05 07 05 02 0E 0C 0B 6A 64 05 04 00 01 50
30 Device ID=748132
41 Data: 82 A6 4E 0B 6A 64 14 00 7B
45 Data: 86 A6 4E 0B 6A 64 14 22 00 50 77 69 68 61 72 74 67 77 00 00 00 00 00
52 Data: 82 A6 4E 0B 6A 64 4A 00 25
56 Data: 86 A6 4E 0B 6A 64 4A 0A 00 50 01 01 65 00 05 02 01 03 1B
69 Slot0 Var Units=251
74 Slot1 Var Units=70
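Note that in the result above, the long `Data:` lines appear without their wrapped tails (for example, line 19 lacks the `00 26 00 26 84 8E` that follows it in the file), because those continuation lines do not contain `Data:` themselves. The following is a minimal Python sketch, not part of the original answer, of the same split-and-filter idea with wrapped lines merged back in. The function name `split_and_filter` and the rule "a line made only of two-digit hex bytes directly after a `Data:` line is a continuation" are assumptions made for illustration. In an SSIS package the equivalent logic would typically live in a Script Component.

```python
import re

# Assumption: a line consisting only of two-digit hex bytes is the wrapped
# tail of the preceding "Data:" line.
HEX_ONLY = re.compile(r"^[0-9A-Fa-f]{2}(?: [0-9A-Fa-f]{2})*$")


def split_and_filter(text):
    """Return (line_number, line) pairs for the lines of interest."""
    # splitlines() breaks on \r, \n and \r\n (among other separators).
    # Blank lines are dropped before numbering, as in the T-SQL query.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]

    hits = []
    in_data = False  # True while the previous kept line is a "Data:" line
    for nr, line in enumerate(lines, start=1):
        if in_data and HEX_ONLY.match(line):
            # Append the wrapped bytes to the "Data:" line they belong to.
            prev_nr, prev_line = hits[-1]
            hits[-1] = (prev_nr, prev_line + " " + line)
            continue
        in_data = line.startswith("Data:")
        # Same three filters as the query; "unit" is compared case-insensitively.
        if in_data or "Device ID=" in line or "unit" in line.lower():
            hits.append((nr, line))
    return hits
```

Calling `split_and_filter(open(path).read())` on a log file (with `path` being whatever file you are loading) would return the same line numbers as the query, with each `Data:` entry carrying its full byte sequence.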
You did not state your expected output or which names in the text correspond to your columns, so this is a guess... I hope it helps.
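Since the expected output is not stated, one possible reading is that DEVICEID comes from `Device ID=`, DATAVALUE from `SlotN Var Value=` and DATAUNITS from `SlotN Var Units=`. That mapping is an assumption, as is attaching each slot to the most recently seen `Device ID`. Under those assumptions, a minimal Python sketch (not part of the original answer; `extract_rows` is a hypothetical helper name) that turns the filtered lines into one row per slot could look like this:

```python
import re

# Assumed mapping of log fields to the requested columns.
DEVICE_ID = re.compile(r"^Device ID=(\d+)$")
SLOT_FIELD = re.compile(r"^Slot(\d+) Var (Value|Units)=(.*)$")


def extract_rows(text):
    """Return one dict per (device, slot) with DEVICEID, DATAVALUE, DATAUNITS."""
    device_id = None  # most recently seen Device ID (assumption: it owns later slots)
    rows = {}         # (device_id, slot number) -> row; dicts keep insertion order
    for raw in text.splitlines():
        line = raw.strip()
        m = DEVICE_ID.match(line)
        if m:
            device_id = m.group(1)
            continue
        m = SLOT_FIELD.match(line)
        if m:
            slot, field, value = m.groups()
            row = rows.setdefault(
                (device_id, int(slot)),
                {"DEVICEID": device_id, "DATAVALUE": None, "DATAUNITS": None},
            )
            column = "DATAVALUE" if field == "Value" else "DATAUNITS"
            row[column] = value.strip()
    return list(rows.values())
```

Each returned dict could then be inserted into the target table. If the real mapping differs, for instance if the values have to be decoded from the `Data:` bytes, only the two regular expressions and the column assignment would need to change.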
Regarding "sql - Parse unstructured text file in SSIS and read each line to get the required data", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44506585/