一、docx转html bat
通过写一个bat脚本实现docx文件的批量转换为html文件,代码示例如下:
@echo off
setlocal EnableDelayedExpansion
for /r %%i in (*.docx) do (
set ename=%%~ni
set ename=!ename:~0,-4!
"C:\Program Files\Microsoft Office\root\Office16\Wordconv.exe" -oice -nme "%%i" -out "%%~pi!ename!.html" -shtml
)
pause
二、docx转word
如果需要在docx和html之间进行转换,可以先将docx转换为word,再将word转换为html,代码示例如下:
from win32com import client
import os
def docx_to_word(docx_path, word_path):
word = client.Dispatch('Word.Application')
docx = word.Documents.Open(docx_path)
docx.SaveAs(word_path, 16)
docx.Close()
word.Quit()
def word_to_html(word_path, html_path):
word = client.Dispatch('Word.Application')
html_doc = word.Documents.Add(word_path)
html_doc.SaveAs2(html_path, 8)
html_doc.Close()
word.Quit()
docx_to_word("example.docx", "example.doc")
word_to_html("example.doc", "example.html")
三、docx转html乱码
在转换docx为html时,有可能会出现中文乱码的问题,需要进行编码转换,代码示例如下:
import mammoth
with open("example.docx", "rb") as docx_file:
result = mammoth.convert_to_html(docx_file, convert_image = mammoth.images.img_element(convert = mammoth.images.relative_to_document("example.docx")))
with open("example.html", "w", encoding = "utf-8") as html_file:
html_file.write(result.value)
四、docx转为txt
将docx转换为txt文件可以使用Python-docx库,在转换之前需要安装Python-docx库,代码示例如下:
from docx import Document
def docx_to_txt(docx_path, txt_path):
document = Document(docx_path)
with open(txt_path, "w", encoding = "utf-8") as txt_file:
for para in document.paragraphs:
txt_file.write(para.text + "\n")
docx_to_txt("example.docx", "example.txt")
五、docx转excel
将docx转换为excel文件可以使用Python-docx库和openpyxl库,在转换之前需要安装这两个库,代码示例如下:
from docx import Document
from openpyxl import Workbook
def docx_to_excel(docx_path, excel_path):
document = Document(docx_path)
wb = Workbook()
ws = wb.active
for table in document.tables:
for i, row in enumerate(table.rows):
for j, cell in enumerate(row.cells):
ws.cell(row = i + 1, column = j + 1, value = cell.text)
wb.save(excel_path)
docx_to_excel("example.docx", "example.xlsx")
六、转docx文档
如果需要将html转换为docx文档,可以使用Python-docx库,代码示例如下:
from docx import Document
from docx.shared import Inches
document = Document()
with open("example.html", "r", encoding = "utf-8") as html_file:
content = html_file.read()
document.add_paragraph(content)
document.add_picture("example.png", width = Inches(6))
document.save("example.docx")
七、docx转为doc格式
如果需要将docx转换为doc格式,可以使用Python-docx库和win32com库,代码示例如下:
from docx import Document
from win32com import client
def docx_to_doc(docx_path, doc_path):
document = Document(docx_path)
document.save(doc_path)
word = client.Dispatch("Word.Application")
doc = word.Documents.Open(doc_path)
doc.SaveAs2(doc_path[:-1], 0)
doc.Close()
word.Quit()
docx_to_doc("example.docx", "example.doc")
八、docx转txt
将docx转换为txt文件可以使用Python-docx库,在转换之前需要安装Python-docx库,代码示例如下:
from docx import Document
def docx_to_txt(docx_path, txt_path):
document = Document(docx_path)
with open(txt_path, "w", encoding = "utf-8") as txt_file:
for para in document.paragraphs:
txt_file.write(para.text + "\n")
docx_to_txt("example.docx", "example.txt")
九、docx转doc
如果需要将docx转换为doc格式,可以使用Python-docx库和win32com库,代码示例如下:
from docx import Document
from win32com import client
def docx_to_doc(docx_path, doc_path):
document = Document(docx_path)
document.save(doc_path)
docx_to_doc("example.docx", "example.doc")