NoteExpress


Importing PDF files

Posted on 2016-12-10 14:24:27
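How does reading a PDF file actually work? Is the title picked out by comparing the different fonts used on the first page? And why can't the information be retrieved correctly for some papers' PDFs?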

Original poster | Posted on 2016-12-10 14:34:13
The automatic extraction of document details (authors, title, journal, etc.) from a research paper works in several steps:
1. The contents of the PDF are analyzed and Mendeley tries to 'guess' which text constitutes the authors, title and other metadata. The accuracy of this step depends on factors such as the complexity of the article's layout.
2. Mendeley looks for identifiers such as DOIs and arXiv IDs in the paper.
3. Mendeley sends the extracted metadata and any identifiers found to Mendeley Web, which in turn queries various online sources, such as arXiv, PubMed and CrossRef, for more accurate data. If better-quality metadata can be found online, it is used; otherwise the document details extracted from the contents of the PDF are used.
The extraction process is imperfect, but we are working to improve the quality of the automatic extraction and the comprehensiveness of the data available on Mendeley Web.
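To make the three steps above concrete, here is a rough Python sketch. It is not Mendeley's or NoteExpress's actual code. It assumes the first-page text has already been pulled out of the PDF as (text, font size) pairs, and it uses "largest font wins" as a stand-in for the title guess, which is only one plausible heuristic. The function names, the `spans` input and the `lookup` callback are all hypothetical; `lookup` stands in for whatever online service is queried.

```python
import re
from typing import Callable, Optional

# A common pattern for modern DOIs; it is not exhaustive.
DOI_RE = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+", re.IGNORECASE)
# New-style arXiv IDs such as arXiv:1501.00001v2 (old-style IDs are not handled).
ARXIV_RE = re.compile(r"arXiv:(\d{4}\.\d{4,5}(?:v\d+)?)", re.IGNORECASE)


def guess_title(spans: list[tuple[str, float]]) -> Optional[str]:
    """Step 1 (assumed heuristic): take the text set in the largest font."""
    if not spans:
        return None
    largest = max(size for _, size in spans)
    parts = [text.strip() for text, size in spans if size == largest]
    return " ".join(parts) or None


def find_identifiers(text: str) -> dict[str, str]:
    """Step 2: look for a DOI and an arXiv ID in the page text."""
    found: dict[str, str] = {}
    doi = DOI_RE.search(text)
    if doi:
        # Drop trailing punctuation that usually belongs to the sentence.
        found["doi"] = doi.group(0).rstrip(".,;)")
    arxiv = ARXIV_RE.search(text)
    if arxiv:
        found["arxiv"] = arxiv.group(1)
    return found


def extract_metadata(
    spans: list[tuple[str, float]],
    lookup: Callable[[dict], Optional[dict]],
) -> dict:
    """Step 3: prefer the online record, else fall back to the local guess."""
    text = " ".join(t for t, _ in spans)
    local = {"title": guess_title(spans), **find_identifiers(text)}
    online = lookup(local)  # hypothetical online query; None means "not found"
    return online if online else local
```

A heuristic like this also suggests why some PDFs fail: a journal banner set in a larger font than the title, or a scanned PDF with no text layer and no DOI, leaves nothing reliable to match on.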
Administrator

Posted on 2016-12-13 10:01:38
Where the title cannot be retrieved correctly, please add it manually. Thanks.