
r - gsub string before the first number, with upper and lower case characters

Reposted. Author: 行者123. Updated: 2023-12-01 12:08:21

I want to remove everything after the first number. The data I have looks like this:

[1] NA                                   "ITEM 1. BUSINESS"                  
[3] "ITEM 1A. RISK FACTORS" "ITEM 1B. UNRESOLVED STAFF COMMENTS"
[5] "ITEM 2. PROPERTIES" "ITEM 3. LEGAL PROCEEDINGS"

I am trying to keep only the part up to and including the number, so that I end up with:

NA           ITEM1
ITEM1A ITEM1B
ITEM2 ITEM3

(or even keeping the space, as in ITEM 1, ITEM 2, etc.)

I have tried the following, without any luck.

x <- toupper(x)
x <- gsub("[^[:alnum:][:space:]]","", x)
x <- gsub(" ", "", x)
x <- substr(x, start = 1, stop = 7)
x <- gsub("\\[digits]*","", x)

I also tried:

y <- str_extract(x, "Item")
y <- str_extract(toupper(words$item), "ITEM")

Data:

x <- c(NA, "ITEM 1. BUSINESS", "ITEM 1A. RISK FACTORS", "ITEM 1B. UNRESOLVED STAFF COMMENTS", 
"ITEM 2. PROPERTIES", "ITEM 3. LEGAL PROCEEDINGS", "ITEM 4. MINE SAFETY DISCLOSURES",
"ITEM 5. MARKET FOR REGISTRANT’S COMMON EQUITY, RELATED STOCKHOLDER MATTERS AND ISSUER PURCHASES OF EQUITY SECURITIES",
"ITEM 6. SELECTED FINANCIAL DATA ", "ITEM 7. MANAGEMENT’S DISCUSSION AND ANALYSIS OF FINANCIAL CONDITION AND RESULTS OF OPERATIONS ",
"ITEM 7A. QUANTITATIVE AND QUALITATIVE DISCLOSURES ABOUT MARKET RISK",
"ITEM 8. FINANCIAL STATEMENTS AND SUPPLEMENTARY DATA", "ITEM 9. CHANGES IN AND DISAGREEMENTS WITH ACCOUNTANTS ON ACCOUNTING AND FINANCIAL DISCLOSURE",
"ITEM 9A. CONTROLS AND PROCEDURES", "ITEM 9B. OTHER INFORMATION",
"ITEM 10. DIRECTORS, EXECUTIVE OFFICERS AND CORPORATE GOVERNANCE",
"ITEM 11. EXECUTIVE COMPENSATION", "ITEM 12. SECURITY OWNERSHIP OF CERTAIN BENEFICIAL OWNERS AND MANAGEMENT AND RELATED STOCKHOLDER MATTERS",
"ITEM 13. CERTAIN RELATIONSHIPS AND RELATED TRANSACTIONS, AND DIRECTOR INDEPENDENCE",
"ITEM 14. PRINCIPAL ACCOUNTING FEES AND SERVICES", "ITEM 15. EXHIBITS, FINANCIAL STATEMENT SCHEDULE",
"Item 1. Business", "Item 1A. Risk Factors", "Item 1B. Unresolved Staff Comments",
"Item 2. Properties", "Item 3. Legal Proceedings", "Item 4. Mine Safety Disclosure",
"Item 5. Market for Registrant’s Common Equity, Related Stockholder Matters and Issuer Purchases of Equity Securities",
"Item 6. Selected Financial Data", "Item 7. Management’s Discussion and Analysis of Financial Condition and Results of Operations",
"Item 7A. Quantitative and Qualitative Disclosures About Market Risk",
"Item 8. Financial Statements and Supplementary Data", "Item 9. Changes in and Disagreements with Accountants on Accounting and Financial Disclosure",
"Item 9A. Controls and Procedures", "Item 9B. Other Information",
"Item 10. Directors, Executive Officers and Corporate Governance",
"Item 11. Executive Compensation", "Item 12. Security Ownership of Certain Beneficial Owners and Management and Related Stockholder Matters",
"Item 13. Certain Relationships and Related Transactions, and Director Independence",
"Item 14. Principal Accountant Fees and Services", "Item 15. Exhibits and Financial Statement Schedules(a)(1) and (2). The following documents have been included in Part II, Item 8. Report of Ernst & Young LLP, Independent Registered Public Accounting Firm, on Financial Statements Consolidated Statements of Financial Position — As of December 31, 2017 and 2016 Consolidated Statements of Income — Years Ended December 31, 2017, 2016 and 2015 Consolidated Statements of Comprehensive Income — Years Ended December 31, 2017, 2016 and 2015 Consolidated Statements of Shareholders’ Equity — Years Ended December 31, 2017, 2016 and 2015 Consolidated Statements of Cash Flows — Years Ended December 31, 2017, 2016 and 2015 Notes to Consolidated Financial Statements",
"Item 1. Business.", "Item 1A. Risk Factors.", "Item 1B. Unresolved Staff Comments.",
"Item 2. Properties.", "Item 3. Legal Proceedings.", "Item 4. Mine Safety Disclosures.",
"Item 5. Market for Registrant's Common Equity, Related Stockholder Matters and Issuer Purchases of Equity Securities.",
"Item 6. Selected Financial Data.", "Item 7. Management's Discussion and Analysis of Financial Condition and Results of Operations. ",
"Item 7A. Quantitative and Qualitative Disclosures About Market Risk.",
"Item 8. Financial Statements and Supplementary Data.", "Item 9. Changes in and Disagreements with Accountants on Accounting and Financial Disclosure.",
"Item 9A. Controls and Procedures.", "Item 9B. Other Information.",
"Item 10. Directors, Executive Officers and Corporate Governance.",
"Item 11. Executive Compensation.", "Item 12. Security Ownership of Certain Beneficial Owners and Management and Related Stockholder Matters.",
"Item 13. Certain Relationships and Related Transactions, and Director Independence.",
"Item 14. Principal Accounting Fees and Services.", "Item 15. Exhibits, Financial Statement Schedules.",
"Item 16. Form 10-K Summary.", "Item 4. Mine Safety Disclosures",
"Item 4A. Executive Officers", "Item 5. Market for the Registrant's Common Equity, Related Stockholder Matters and Issuer Purchases of Equity Securities",
"Item 6. Selected Financial Data", "Item 7. Management's Discussion and Analysis of Financial Condition and Results of Operations",
"Item 8. Financial Statements and Supplementary Data", "Item 15. Exhibits, Financial Statement Schedules"
)

Best answer

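We can use sub to capture, as a single group, one or more non-digit characters followed by one or more digits and any letter suffix (such as the A in 1A), and then refer to that captured group with a backreference (\\1) in the replacement. The trailing .* matches the rest of the string, so it is dropped.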

x1 <- sub("^([^0-9]+[0-9]+[A-Za-z]*).*", "\\1", x)
x1
#[1] NA "ITEM 1" "ITEM 1A" "ITEM 1B" "ITEM 2" "ITEM 3" "ITEM 4" "ITEM 5" "ITEM 6" "ITEM 7" "ITEM 7A" "ITEM 8" "ITEM 9"
#[14] "ITEM 9A" "ITEM 9B" "ITEM 10" "ITEM 11" "ITEM 12" "ITEM 13" "ITEM 14" "ITEM 15" "Item 1" "Item 1A" "Item 1B" "Item 2" "Item 3"
#[27] "Item 4" "Item 5" "Item 6" "Item 7" "Item 7A" "Item 8" "Item 9" "Item 9A" "Item 9B" "Item 10" "Item 11" "Item 12" "Item 13"
#[40] "Item 14" "Item 15" "Item 1" "Item 1A" "Item 1B" "Item 2" "Item 3" "Item 4" "Item 5" "Item 6" "Item 7" "Item 7A" "Item 8"
#[53] "Item 9" "Item 9A" "Item 9B" "Item 10" "Item 11" "Item 12" "Item 13" "Item 14" "Item 15" "Item 16" "Item 4" "Item 4A" "Item 5"
#[66] "Item 6" "Item 7" "Item 8" "Item 15"
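
To make the pattern easier to follow, here is a minimal sketch that applies it to a small hand-picked vector. The name `demo` and the element "no digits here" are made up for illustration and are not part of the question's data. It also shows that NA and strings without a match pass through unchanged.

# `demo` is a hypothetical vector used only for this illustration.
demo <- c(NA, "ITEM 1. BUSINESS",
          "Item 7A. Quantitative and Qualitative Disclosures",
          "Item 10. Directors", "no digits here")

# ^          anchor at the start of the string
# [^0-9]+    one or more non-digit characters ("ITEM " or "Item ")
# [0-9]+     the item number
# [A-Za-z]*  an optional letter suffix, such as the "A" in "7A"
# .*         the rest of the string, dropped because it is outside the group
res <- sub("^([^0-9]+[0-9]+[A-Za-z]*).*", "\\1", demo)
print(res)
#[1] NA "ITEM 1" "Item 7A" "Item 10" "no digits here"

Because sub returns the input unchanged when the pattern does not match, it may be worth checking for such elements if the real data can contain lines without an item number.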

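If we also want to remove the space, apply sub once more. sub replaces only the first match, which is enough here because each extracted label contains a single space.

# remove the space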
x2 <- sub("\\s+", "", toupper(x1))
head(x2)
#[1] NA "ITEM1" "ITEM1A" "ITEM1B" "ITEM2" "ITEM3"
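
As a rough sketch of when gsub would be the safer choice: sub stops after the first match, while gsub replaces every match. The vector `labels` below is hypothetical, and its last element deliberately contains extra whitespace that does not occur in the question's data.

# `labels` is a made-up input; the last element has two separate whitespace runs.
labels <- c(NA, "ITEM 1", "Item 7A", "Item  10 A")

# sub removes only the first run of whitespace in each string
first_only <- sub("\\s+", "", toupper(labels))

# gsub removes every run of whitespace in each string
all_runs <- gsub("\\s+", "", toupper(labels))

print(first_only)
#[1] NA "ITEM1" "ITEM7A" "ITEM10 A"
print(all_runs)
#[1] NA "ITEM1" "ITEM7A" "ITEM10A"

For the data in the question both calls give the same result, so the choice only matters if the labels can contain more than one space.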

Regarding "r - gsub string before the first number, with upper and lower case characters", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54408849/
