Reading .xlsx files with Python's csv module

codebag 2023. 7. 12. 23:46

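I'm trying to read an Excel file in .xlsx format with the csv module, but I'm having no luck, even with the dialect and encoding specified. Below are my attempts with the different encodings I tried and the errors they produced. If anyone can point me to the correct encoding, syntax, or module for reading an .xlsx file in Python, I'd appreciate it.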

The code below raises the following error: _csv.Error: line contains NULL byte

#!/usr/bin/python

import sys, csv

with open('filelocation.xlsx', "r+", encoding="Latin1")  as inputFile:
    csvReader = csv.reader(inputFile, dialect='excel')
    for row in csvReader:
        print(row)

The code below raises the following error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcc in position 16: invalid continuation byte

#!/usr/bin/python

import sys, csv

with open('filelocation.xlsx', "r+", encoding="utf-8") as inputFile:
    csvReader = csv.reader(inputFile, dialect='excel')
    for row in csvReader:
        print(row)

When I use utf-16 as the encoding, I get the following error: UnicodeDecodeError: 'utf-16-le' codec can't decode bytes in position 570-571: illegal UTF-16 surrogate

You cannot use Python's csv library to read xlsx-formatted files. You need to install and use a different library. For example, you can use openpyxl like this:

import openpyxl

wb = openpyxl.load_workbook("filelocation.xlsx")
ws = wb.active

for row in ws.iter_rows(values_only=True):
    print(row)

This prints every row of the file as a tuple of cell values. The Python Excel website gives other possible examples.
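
The errors in the question make sense once you know that an .xlsx file is a ZIP archive of XML documents rather than delimited text, so no choice of encoding will make csv.reader understand it. A minimal sketch of a check for this, using only the standard library. The helper name looks_like_xlsx is made up for illustration, and testing for two member names is a heuristic, not a full validation:

```python
import zipfile


def looks_like_xlsx(path):
    """Return True if `path` is a ZIP archive that resembles an .xlsx workbook.

    Hypothetical helper: it only checks for the ZIP container and two
    member names that Excel workbooks normally contain.
    """
    if not zipfile.is_zipfile(path):
        return False  # plain text, such as a real CSV file, ends up here
    with zipfile.ZipFile(path) as archive:
        names = set(archive.namelist())
    return "[Content_Types].xml" in names and "xl/workbook.xml" in names
```

If this returns True for a file, it needs a spreadsheet library such as openpyxl rather than the csv module.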

Alternatively, you can build a list of rows:

import openpyxl

wb = openpyxl.load_workbook("filelocation.xlsx")
ws = wb.active

data = list(ws.iter_rows(values_only=True))

print(data)
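
Since the original goal was to work with the csv module, rows in this shape can be handed to csv.writer to produce real CSV text. A sketch, assuming the rows are tuples like those in data above. The sample values and the helper name rows_to_csv_text are invented for illustration:

```python
import csv
import io


def rows_to_csv_text(rows):
    """Serialize an iterable of row tuples to CSV text (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, dialect="excel")
    # csv.writer writes None (an empty cell in openpyxl) as an empty field.
    writer.writerows(rows)
    return buffer.getvalue()


# Sample rows standing in for list(ws.iter_rows(values_only=True)).
sample = [("name", "qty"), ("apple", 3), ("pear", None)]
print(rows_to_csv_text(sample))
```

To write to disk instead, the same writer can wrap a file opened with newline="" in place of the in-memory buffer.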

Note: if you are working with the older Excel format, .xls, you can use the xlrd library instead. It no longer supports the .xlsx format, though.

import xlrd

workbook = xlrd.open_workbook("filelocation.xls")
sheet = workbook.sheet_by_index(0)
data = [sheet.row_values(rowx) for rowx in range(sheet.nrows)]

print(data)
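
Because the two formats need different libraries, one option is to pick the reader from the file extension. A small sketch. The function name reader_for and the returned labels are illustrative only, and nothing here imports either library:

```python
from pathlib import Path


def reader_for(path):
    """Return the name of the library suited to the file's extension."""
    suffix = Path(path).suffix.lower()
    if suffix in (".xlsx", ".xlsm"):
        return "openpyxl"  # modern ZIP-based workbooks
    if suffix == ".xls":
        return "xlrd"  # legacy binary workbooks
    if suffix == ".csv":
        return "csv"  # plain delimited text
    raise ValueError(f"unsupported spreadsheet extension: {suffix!r}")
```

An extension can be wrong, so this is only a first guess about the file's real format.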

Here is a very rough implementation that uses only the standard library.

def xlsx(fname, sheet=1):
    import zipfile
    from xml.etree.ElementTree import iterparse
    z = zipfile.ZipFile(fname)
    strings = [el.text for e, el in iterparse(z.open('xl/sharedStrings.xml')) if el.tag.endswith('}t')]
    rows = []
    row = {}
    value = ''
    for e, el in iterparse(z.open('xl/worksheets/sheet%s.xml' % sheet)):
        if el.tag.endswith('}v'):  # <v>84</v>
            value = el.text
        if el.tag.endswith('}c'):  # <c r="A3" t="s"><v>84</v></c>
            if el.attrib.get('t') == 's':
                value = strings[int(value)]
            column_name = ''.join(x for x in el.attrib['r'] if not x.isdigit())  # AZ22
            row[column_name] = value
            value = ''
        if el.tag.endswith('}row'):
            rows.append(row)
            row = {}
    return rows

(Copied from a deleted question: https://stackoverflow.com/questions/4371163/reading-xlsx-files-using-python )
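
The rows returned by this function are dicts keyed by column letters such as A or AZ. If positional order matters, the letters can be converted to numbers by treating them as base-26 digits. A sketch. The name column_index is chosen here and is not part of the copied answer:

```python
def column_index(letters):
    """Convert Excel column letters to a 1-based index ("A" -> 1, "AA" -> 27)."""
    index = 0
    for char in letters.upper():
        if not "A" <= char <= "Z":
            raise ValueError(f"not a column letter: {char!r}")
        index = index * 26 + (ord(char) - ord("A") + 1)
    return index


# Made-up row in the shape the function above returns.
row = {"B": "2", "A": "1", "AA": "27"}
ordered = [row[key] for key in sorted(row, key=column_index)]
print(ordered)  # ['1', '2', '27']
```

Sorting the keys as plain strings would put AA before B, which is why the numeric conversion is used as the sort key.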

Here is a very rough implementation that uses only the standard library.

def xlsx(fname):
    import zipfile
    from xml.etree.ElementTree import iterparse
    z = zipfile.ZipFile(fname)
    strings = [el.text for e, el in iterparse(z.open('xl/sharedStrings.xml')) if el.tag.endswith('}t')]
    rows = []
    row = {}
    value = ''
    for e, el in iterparse(z.open('xl/worksheets/sheet1.xml')):
        if el.tag.endswith('}v'):  # <v>84</v>
            value = el.text
        if el.tag.endswith('}c'):  # <c r="A3" t="s"><v>84</v></c>
            if el.attrib.get('t') == 's':
                value = strings[int(value)]
            letter = el.attrib['r'] # AZ22
            while letter[-1].isdigit():
                letter = letter[:-1]
            row[letter] = value
            value = ''
        if el.tag.endswith('}row'):
            rows.append(row)
            row = {}
    return rows

This answer was copied from a deleted question: https://stackoverflow.com/a/22067980/131881
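
The key step in both versions is the shared-string lookup: a cell marked t="s" stores an index into xl/sharedStrings.xml rather than the text itself. A sketch of that lookup on a hand-written XML fragment. The fragment and the strings list are made-up sample data, and the XML namespace is omitted for brevity:

```python
from xml.etree.ElementTree import fromstring

strings = ["name", "qty"]  # stand-in for the parsed sharedStrings.xml

sheet_xml = """
<sheetData>
  <row r="1">
    <c r="A1" t="s"><v>0</v></c>
    <c r="B1" t="s"><v>1</v></c>
  </row>
  <row r="2">
    <c r="A2"><v>84</v></c>
  </row>
</sheetData>
"""


def cell_values(xml_text, shared):
    """Map each cell reference to its resolved value."""
    values = {}
    for cell in fromstring(xml_text.strip()).iter("c"):
        raw = cell.findtext("v", default="")
        if cell.get("t") == "s":
            raw = shared[int(raw)]  # shared-string cells hold an index
        values[cell.get("r")] = raw
    return values


print(cell_values(sheet_xml, strings))
```

Cells without t="s", such as A2 here, keep the literal text of their v element, so numbers come back as strings.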

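You cannot use Python's csv library to read files in .xlsx format. I also could not use "pd.read_excel", which in my setup only supported .xls (this happens when pandas falls back to xlrd; with openpyxl installed, pd.read_excel can read .xlsx). Below is a function I wrote to import .xlsx files. It uses the first row of the imported file as the column names. It's pretty straightforward.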

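import openpyxl
import pandas as pd

def import_xlsx(filepath):
    # data_only=True returns cached formula results instead of formula strings
    wb = openpyxl.load_workbook(filename=filepath, data_only=True)
    ws = wb.active
    rows = list(ws.iter_rows(values_only=True))
    new = pd.DataFrame(data=rows)
    new1 = new[1:]
    # Use the first row of the sheet as the column names
    new1.columns = new[0:1].values[0].tolist()
    return new1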

Example:

new_df = import_xlsx('C:\\Users\\big_boi\\documents\\my_file.xlsx')
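
The same idea, promoting the first row to column names, can be sketched without pandas when a list of dicts is enough. This assumes rows shaped like the output of ws.iter_rows(values_only=True). The helper name records_from_rows and the sample values are hypothetical:

```python
def records_from_rows(rows):
    """Turn [header, row, row, ...] into a list of dicts keyed by header."""
    rows = list(rows)
    if not rows:
        return []  # an empty sheet yields no records
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


# Made-up rows standing in for a real worksheet.
sample_rows = [("name", "qty"), ("apple", 3), ("pear", None)]
print(records_from_rows(sample_rows))
```

Note that zip stops at the shorter sequence, so cells beyond the header's width are dropped.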

Source URL: https://stackoverflow.com/questions/35744613/read-in-xlsx-with-csv-module-in-python
