
data-analysis - Tools/methods for processing unstructured medical text data into CSV

Reposted. Author: 行者123. Updated: 2023-12-02 03:17:36

10/03/2014 16:55  Local Title: TRANSFER OUT NOTE
Standard Title: TRANSFER SUMMARIZATION NOTE
AUTHOR: D,WARD

XYZ MEDICAL INSTITUTE
ABC NAGAR, PQW CITY-101011
******************************************************************
TRANSFER OUT NOTE
******************* OCT 03, 2014

UHID:000-01-0202 PATIENT NAME: NAME , SINGH
AGE/SEX:42/FEMALE

DOA:Sep 30,2014

DEPARTMENT:GYNAE AND OBSTETRICS UNIT:II

TRANSFERRED FROM:D3

NAME , SINGH 000-01-0202 DOB: 01/01/1972





TRANSFERRED TO : MCU

DIAGNOSIS:pop- em lscs with male baby nicu B


TREATMENT:
inj.cefazolin 1 gm bd
inj.rantac 1 amp tds
inj.perinorm 1 amp tds
inj.pcm 1 gm tds
inj.texid 1 gm tds


PATIENT STATUS AT THE TIME OF SHIFTING:
g.c. fair on iv fluid ..


NAME , SINGH 000-01-0202 DOB: 01/01/1972




VITALS AT THE TIME OF SHIFTING:
TEMP:98.6F

HR:88/MIN RR:24/MIN

GCS: E V M


< THE ABOVE NOTE IS UNSIGNED >
- DRAFT COPY * DRAFT COPY * DRAFT COPY * DRAFT COPY * DRAFT COPY * DRAFT COPY -

09/21/2014 23:01 Local Title: MED ONCO IRCH DISCHARGE SUMMARY
Standard Title: DISCHARGE SUMMARY
AUTHOR: KUMAR,UVW

LOCAL TITLE: MED ONCO IRCH DISCHARGE SUMMARY
STANDARD TITLE: DISCHARGE SUMMARY

NAME , SINGH 000-01-0202 DOB: 01/01/1972




DATE OF NOTE: SEP 21, 2014@22:04 ENTRY DATE: SEP 21, 2014@22:04:42
AUTHOR: UVW KUMAR

REGISTRATION DETAILS
********************
UHID No:000-01-0202 IRCH No:000222 CR No:111000
NAME: NAME AGE:22 YEAR GENDER:MALE
DOA:Sep 2, 2014 DOD:Sep 18, 2014 DURATION OF STAY: days
WARD: MRO Ward BED No:14
CONSULTANT INCHARGE:Dr UVW Kumar

DIAGNOSIS & REASON FOR CURRENT ADMISSION
****************************************
DIAGNOSIS:Acute Promyelocytic leukemia (Intermediate Risk)

ADMITTED FOR :Chemotherapy
CASE SUMMARY:NAME Singh presented with complaints of bleeding gums, fever,

NAME , SINGH 000-01-0202 DOB: 01/01/1972




blurring of vision and gum hypertrophy. He diagnosed as APML in PQW
hospital based on PS, BMA and PML/RARa positive. He started on ATRA and after
that reffered here. His basline hemorem at PQW Hospital was s/o Hb :
4.6, TLC: 1580/cu.mm, Platlet: 6000/cu.mm. So he is classified as
intermideate risk APML. After coming here diagnosis reconfirmed,
daunorubicin given 60mg/m2 and continoued on ATRA. No features of
ATRA syndrome noticed during ward stay. His fibrinogen level were > 450
mg/dl. He remained afebrile and hemodynamically stable and dischared on
stable condition.

PRESENTATION AT CURRENT ADMISSION
*********************************
VITAL SIGNS:
TEMP:99 F RESP:19/min PULSE:98/min
BP:121/78 mm of Hg SPO2:99% on RA



NAME , SINGH 000-01-0202 DOB: 01/01/1972




GENERAL PHYSICAL EXAMINATION: PERFORMANCE STATUS: I
PALLOR:+ ICTERUS:- OEDEMA:- CYANOSIS:-
STERNAL TENDERNESS:- CLUBBING:- GUM HYPERTROPHY:+
LYMPHNODES: -

BIOMETRIC DETAILS: WEIGHT: 45 kg HEIGHT:166 cms BSA: 1.4 m2

INVESTIGATIONS AT CURRENT ADMISSSION
************************************
PS (3/9/2014) : N2, L8, E-, M1, B-, Meta-, Myelo-, Blast 89%. Blast and abnormal

promyelocytes present. F/S/O Acute promyelocytic leukemia.

BMA (3/9/2014): Cellular BM shows 90% blast and abnormal promyelocyte. F/S/O
APML.

Flow Cytometery (3/9/2014): 87% abnormal promyelocyte, Positive : CD45, CD15,

NAME , SINGH 000-01-0202 DOB: 01/01/1972




CD11b, CD13, CD33, CD64, CD9, CD18, cMPO.
Negative for CD2, CD14, CD117, CD19, HLADR, CCD79a, cCD3.

Day 12 PS (9/9/2014): N78, L20, E-, M2, B-, Meta-, Myelo_ Promyelo Nil, Blast
Nil.


Condition at discharge:
VITAL SIGNS:
TEMP:99 F RESP:18/min PULSE:78/min
BP:112/74 mm of Hg SPO2:99% on RA


Plan At discharge and follow up: As written in OPD card




NAME , SINGH 000-01-0202 DOB: 01/01/1972






< THE ABOVE NOTE IS UNSIGNED >
- DRAFT COPY * DRAFT COPY * DRAFT COPY * DRAFT COPY * DRAFT COPY * DRAFT COPY -

09/21/2014 22:04 Local Title: MED ONCO IRCH DISCHARGE SUMMARY
Standard Title: DISCHARGE SUMMARY
AUTHOR: UVW,AMIT

REGISTRATION DETAILS
********************
UHID No:000-01-0202 IRCH No:000222 CR No:111000
NAME: NAME , SINGH AGE:42 GENDER:FEMALE
DOA:Sep 2, 2014 DOD:Sep 18, 2014 DURATION OF STAY: days
WARD: MRO Ward BED No:14
CONSULTANT INCHARGE:Dr Lalit Kumar
ADDRESS: ,

NAME , SINGH 000-01-0202 DOB: 01/01/1972




DIAGNOSIS & REASON FOR CURRENT ADMISSION
****************************************
DIAGNOSIS:
Acute Promyelocytic leukemia (Intermediate Risk)

ADMITTED FOR :Chemotherapy
CASE SUMMARY:NAME Singh presented with complaints of bleeding gums,
fever, blurring of vision and gum hypertrophy. He diagnosed as APML in
UVW hospital based on PS and PML/RARa positive. He started on ATRA and
after that reffered to XYZ hospital

PRESENTATION AT CURRENT ADMISSION
*********************************
VITAL SIGNS:
TEMP:F RESP:/min PULSE:/min
BP:/mm of Hg SPO2:%


NAME , SINGH 000-01-0202 DOB: 01/01/1972





GENERAL PHYSICAL EXAMINATION: PERFORMANCE STATUS:
PALLOR: ICTERUS: OEDEMA: CYANOSIS:
STERNAL TENDERNESS: CLUBBING: GUM HYPERTROPHY:
LYMPHNODES:


SPECIFIC FINDINGS:

BIOMETRIC DETAILS: WEIGHT:kgS HEIGHT:cms BSA: m2
INVESTIGATIONS AT CURRENT ADMISSSION
************************************


< THE ABOVE NOTE IS UNSIGNED >
- DRAFT COPY * DRAFT COPY * DRAFT COPY * DRAFT COPY * DRAFT COPY * DRAFT COPY -


NAME , SINGH 000-01-0202 DOB: 01/01/1972

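This is the text content I need to convert to CSV. It contains the details of one patient who visited the hospital several times. I want to extract the medical data into separate column headers [Age, Sex, UHID, DOA, Department, Diagnosis, Treatment, Patient Status, Vitals, Local Title, Standard Title, Case Summary, Admitted For, General Physical Examination].

As you can see, "DIAGNOSIS" is repeated, and the column names may also differ from note to note.

The file to be processed is 15 GB.

Please suggest an approach to this problem. I have tried Python, OpenRefine, and cTAKES.

Please tell me how to solve this kind of problem. The constraint is that we can only use free, open-source tools.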

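Best answer

You can do something like this with gawk. Multi-line fields such as vitals and treatment may be hard to shoehorn into CSV format, but here is a start on the single-value fields.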

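# Requires gawk: the three-argument form of match() is a gawk extension.

# Quote a value for CSV so embedded commas (e.g. "Sep 30,2014") stay in one field.
function q(s) {
    gsub(/"/, "\"\"", s)
    return "\"" s "\""
}

# Emit the fields collected for the current note as one CSV row.
function dump() {
    print q(age) "," q(sex) "," q(uhid) "," q(doa) "," q(dept) "," q(diagnosis)
}

BEGIN { onfirst = 1 }
END   { if (!onfirst) dump() }

# Normalize every line: drop leading spaces and unify the two UHID labels.
{
    sub(/^ */, "")
    sub(/UHID No/, "UHID")
}

# A UHID line starts a new record: flush the previous one and reset its fields
# so that values do not leak from one note into the next.
match($0, /UHID:([^ ]*)/, a) {
    if (onfirst)
        onfirst = 0
    else
        dump()
    age = sex = doa = dept = diagnosis = ""
    uhid = a[1]
}

match($0, /AGE\/SEX:([0-9]*)\/(.*[^ ]) *$/, a) {
    age = a[1]
    sex = a[2]
}

# Three space-separated tokens after "DOA:", e.g. "Sep 2, 2014".
match($0, /DOA:([^ ][^ ]* *[^ ][^ ]* *[^ ][^ ]*)/, a) {
    doa = a[1]
}

match($0, /DEPARTMENT:(.*[^ ]) *UNIT/, a) {
    dept = a[1]
}

# Only matches when the value is on the same line as the label.
match($0, /DIAGNOSIS:(.*[^ ]) *$/, a) {
    diagnosis = a[1]
}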
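
Since the question mentions Python and a 15 GB input, here is a rough Python sketch of the same idea that reads the file lazily, accepts the label variants seen in the sample (UHID / UHID No, AGE/SEX / AGE + GENDER), and folds multi-line fields such as TREATMENT into a single cell. It is not part of the original answer. The names parse_notes and notes_to_csv, the column list, and the rules "a note starts at a timestamp followed by Local Title:", "a multi-line value ends at the first blank line", and "a repeated page header looks like <name> <UHID> DOB:" are all assumptions drawn from the sample above, so they would need checking against the real data.

```python
import csv
import re

COLUMNS = ["uhid", "age", "sex", "doa", "diagnosis", "treatment"]

# Single-line fields: column -> pattern covering the label variants in the sample.
SINGLE = {
    "uhid": re.compile(r"UHID(?: No)?:\s*(\S+)"),
    "age": re.compile(r"\bAGE(?:/SEX)?:\s*(\d+)"),
    "sex": re.compile(r"(?:AGE/SEX:\s*\d+/|GENDER:\s*)(\w+)"),
    "doa": re.compile(r"DOA:\s*(\w+ \d+, ?\d+)"),
    "diagnosis": re.compile(r"^DIAGNOSIS:\s*(\S.*?)\s*$"),
}
# Labels that stand alone on a line, with the value on the following lines.
MULTI = {"TREATMENT:": "treatment", "DIAGNOSIS:": "diagnosis"}

# Assumed record boundary: "MM/DD/YYYY HH:MM Local Title: ...".
NOTE_START = re.compile(r"^\d{2}/\d{2}/\d{4} \d{2}:\d{2}\s+Local Title:")
# Assumed shape of the page header repeated inside every note.
PAGE_HEADER = re.compile(r"^.+ \d{3}-\d{2}-\d{4} DOB:")


def parse_notes(lines):
    """Yield one dict per note; only the current note is held in memory."""
    record, collecting = None, None
    for raw in lines:
        line = raw.strip()
        if PAGE_HEADER.match(line):
            continue                      # drop repeated page headers
        if NOTE_START.match(line):
            if record is not None:
                yield record
            record, collecting = dict.fromkeys(COLUMNS, ""), None
            continue
        if record is None:
            continue                      # text before the first note
        if not line:
            collecting = None             # a blank line ends a multi-line value
            continue
        if collecting:
            prev = record[collecting]
            record[collecting] = prev + "; " + line if prev else line
            continue
        if line in MULTI:
            collecting = MULTI[line]
            continue
        for column, pattern in SINGLE.items():
            found = pattern.search(line)
            if found and not record[column]:
                record[column] = found.group(1)   # first occurrence wins
    if record is not None:
        yield record


def notes_to_csv(src, dst):
    """Stream notes from the text handle src to CSV on the handle dst."""
    writer = csv.DictWriter(dst, fieldnames=COLUMNS)
    writer.writeheader()
    for record in parse_notes(src):
        writer.writerow(record)
```

To run it on a real file, open the input as a text handle and the output with newline="" (the file names are up to you) and pass both to notes_to_csv. The csv module quotes values such as "Sep 30,2014" automatically, so embedded commas do not split a column. Other fields from the question (vitals, case summary, and so on) would need their own entries in SINGLE or MULTI.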

Regarding "data-analysis - Tools/methods for processing unstructured medical text data into CSV", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/35650679/
