JAVA读取PDF、WORD文档实例代码
时间:2022-02-02 09:34:38|栏目:JAVA代码|点击: 次
读取PDF文件jar引用
<dependency> <groupid>org.apache.pdfbox</groupid> pdfbox</artifactid> <version>1.8.13</version> </dependency>
读取WORD文件jar引用
<dependency> <groupid>org.apache.poi</groupid> poi-scratchpad</artifactid> <version>3.16-beta1</version> </dependency> <dependency> <groupid>org.apache.poi</groupid> poi</artifactid> <version>3.16-beta1</version> </dependency>
读取WORD文件方法
/**
*
* @Title: getTextFromWord
* @Description: 读取word
* @param filePath
* 文件路径
* @return: String 读出的Word的内容
*/
public static String getTextFromWord(String filePath) {
String result = null;
File file = new File(filePath);
FileInputStream fis = null;
try {
fis = new FileInputStream(file);
@SuppressWarnings("resource")
WordExtractor wordExtractor = new WordExtractor(fis);
result = wordExtractor.getText();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} finally {
if (fis != null) {
try {
fis.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
return result;
}
读取PDF文件方法
/**
*
* @Title: getTextFromPdf
* @Description: 读取pdf文件内容
* @param filePath
* @return: 读出的pdf的内容
*/
public static String getTextFromPdf(String filePath) {
String result = null;
FileInputStream is = null;
PDDocument document = null;
try {
is = new FileInputStream(filePath);
PDFParser parser = new PDFParser(is);
parser.parse();
document = parser.getPDDocument();
PDFTextStripper stripper = new PDFTextStripper();
result = stripper.getText(document);
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} finally {
if (is != null) {
try {
is.close();
} catch (IOException e) {
e.printStackTrace();
}
}
if (document != null) {
try {
document.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
return result;
}
希望本篇实例代码可以帮到您


阅读排行
- 1Java Swing组件BoxLayout布局用法示例
- 2java中-jar 与nohup的对比
- 3Java邮件发送程序(可以同时发给多个地址、可以带附件)
- 4Caused by: java.lang.ClassNotFoundException: org.objectweb.asm.Type异常
- 5Java中自定义异常详解及实例代码
- 6深入理解Java中的克隆
- 7java读取excel文件的两种方法
- 8解析SpringSecurity+JWT认证流程实现
- 9spring boot里增加表单验证hibernate-validator并在freemarker模板里显示错误信息(推荐)
- 10深入解析java虚拟机




