Poster: andrewbontrager Date: Jun 5, 2012 10:47am
Forum: texts Subject: problem viewing full text

Hello. When I go to
archive.org/details/UnitedStatesPatent282
and click on the Full Text link, only the first line of text comes up, but if I go into the "All Files: HTTP" link and click on the text file there, it shows fine. Can someone tell me what's wrong with the text files I am uploading? Thanks very much.

Poster: garthus1 Date: Jun 6, 2012 12:38pm
Forum: texts Subject: Re: problem viewing full text

Andrew,

Which file are you trying to look at?

Gerry

Poster: andrewbontrager Date: Jun 7, 2012 8:28am
Forum: texts Subject: Re: problem viewing full text

I am trying to be sure the full text link brings up the text, and I can't understand why the text file comes up fine in the HTTP directory but not from the full text link. If you can see it fine please let me know, it might be just my computer. Thanks.

Poster: garthus1 Date: Jun 7, 2012 9:22am
Forum: texts Subject: Re: problem viewing full text

Andrew,

I need the link for the item. Post it here.

Gerry

Poster: stbalbach Date: Jun 7, 2012 10:51am
Forum: texts Subject: Re: problem viewing full text

I think Andrew means if you go to

http://archive.org/details/UnitedStatesPatent282

Then click on "Full Text" (in the left side "View the Book" window), nothing shows on the screen. But if you go to "All Files: HTTP", it's possible to view it that way. It does appear to be a bug of some sort.

Poster: garthus1 Date: Jun 7, 2012 1:26pm
Forum: texts Subject: Re: problem viewing full text

Andrew,

Something is wrong with the file derivation. If you upload a file, it should derive more formats than you have displayed. Try uploading it again, or go to edit metadata->items and then re-derive the document.

Also, what type of PDF file are you using? You should create it in Irfanview and then save it as a PDF; sometimes Adobe Acrobat or Microsoft corrupts the PDF file and it does not derive properly.

Gerry

Poster: andrewbontrager Date: Jun 7, 2012 11:01am
Forum: texts Subject: Re: problem viewing full text

Precisely. I finally got the text to where it doesn't run off the screen like most of my items, but now the text doesn't show at all from the full text link. Hope the administrators can fix it, glad it's nothing on my computer. All my new additions to
archive.org/details/uspatentssingledocuments
are coming up with the glitch too. Thanks guys.

Poster: stbalbach Date: Jun 7, 2012 1:15pm
Forum: texts Subject: Re: problem viewing full text

Andrew, if all the uploads have the same problem, it may be something in the text itself: perhaps embedded codes, or Unix vs. DOS line breaks, something like that. The streaming mode (Full Text) could be trying to display it as a single long line without line breaks, but exceeding a buffer, so it doesn't display. Do you have access to a Unix box where you could run it through unix2dos (or dos2unix) to convert the line breaks? These are just guesses; an admin will probably have to take a look.
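
For readers without a Unix box, the line-break check suggested above can be approximated in a few lines of Python. This is only a sketch: the function names `count_line_endings` and `to_unix` are made up for illustration, and it assumes the file is small enough to read into memory as bytes.

```python
def count_line_endings(data: bytes) -> dict:
    """Count each line-ending style found in raw file bytes."""
    crlf = data.count(b"\r\n")          # DOS/Windows style
    cr = data.count(b"\r") - crlf       # lone CR (classic Mac style)
    lf = data.count(b"\n") - crlf       # lone LF (Unix style)
    return {"CRLF": crlf, "CR": cr, "LF": lf}


def to_unix(data: bytes) -> bytes:
    """Normalize CRLF and lone CR line endings to LF."""
    # Replace CRLF first so the CR in each pair is not converted twice.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

A file that reports only CR endings would look like one very long line to a tool that splits on LF, which fits the guess above. Converting to DOS endings instead would just mean replacing each LF in the output of `to_unix` with CRLF.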

Poster: andrewbontrager Date: Jun 8, 2012 7:40am
Forum: texts Subject: Re: problem viewing full text

As far as I know, there aren't any embedded codes; however, I edit these documents on a nonstandard PDA and that's why the files come out in such long lines.

Poster: stbalbach Date: Jun 8, 2012 7:55am
Forum: texts Subject: Re: problem viewing full text

Line break codes are invisible.

Try this site

http://www.fileformat.info/convert/text/unix2dos.tr

Convert before uploading. See if that solves it.

Poster: andrewbontrager Date: Jun 8, 2012 12:49pm
Forum: texts Subject: Re: problem viewing full text

Thanks for the tip, I tried sending a txt file to the site and came up with a 500 server error, shoot.

Poster: stbalbach Date: Jun 8, 2012 2:30pm
Forum: texts Subject: Re: problem viewing full text

OK, sorry, the site appears to be broken. How about converting to another format using this site:

http://document.online-convert.com/

Try any number of formats, such as text, html, pdf, doc -- then upload and see how Internet Archive displays.

Poster: andrewbontrager Date: Jun 8, 2012 8:37pm
Forum: texts Subject: Re: problem viewing full text

Good idea, but I want to keep with the txt format.

Poster: aibek Date: Nov 7, 2012 11:35pm
Forum: texts Subject: Re: problem viewing full text

andrewbontrager,

My Unix computer shows (using the `file' command) that your file 282.txt is:

Non-ISO extended-ASCII English text, with CR line terminators

This is highly non-standard. Something is wrong with your file. A web search for the string “Non-ISO extended-ASCII English text” brings up:

“I realized that the majority of the file is in ISO-8859-1 and some parts in utf-8.…”
http://stackoverflow.com/questions/5901633/perl-file-encoding-and-word-comparison

I will soon check your file and offer suggestions.
This post was modified by aibek on 2012-11-08 07:35:18

Poster: aibek Date: Nov 8, 2012 12:23am
Forum: texts Subject: Re: problem viewing full text

You have an invalid UTF-8 character in your file. Line 59 starts with:

What I claim as my improvement is

After that there is a 0x97 (octal \227) followed by two 0x0d (the Mac line termination). This sequence is invalid UTF-8.

The proper way to correct this error is to replace the \227 with the proper character in UTF-8 encoding. The no-brainer way is to put a hyphen there, making the file ASCII. Or you may put an en-dash or an em-dash or a quotation-dash. In all the above cases, the file would be valid UTF-8; plain ASCII is, obviously, valid UTF-8.

Try to find out how you got such an exotic character, so that it does not happen with random files!

---

Output of `isutf8':

282.txt: line 1, char 1, byte offset 2913: invalid UTF-8 code

Valid UTF-8 codes:

– en-dash (U+2013)
— em-dash (U+2014)
― quotation-dash (U+2015)

(You can copy these from this page.) To know which one to use, consult your style guide! Check this, though: https://en.wikipedia.org/wiki/Dash

---

Note that the culprit character may not be visible in normal text editors even though it is there. (Some editors will even refuse to open the file.) Emacs, vim and hex editors show the character.
This post was modified by aibek on 2012-11-08 08:15:36
This post was modified by aibek on 2012-11-08 08:23:53
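
If `isutf8' is not available, a similar check can be sketched in Python. The helper name `first_invalid_utf8` is hypothetical, and the sample bytes in the usage note only imitate the sequence described above (the text of line 59, then 0x97, then two 0x0d); they are not read from the real 282.txt.

```python
from typing import Optional, Tuple


def first_invalid_utf8(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (byte offset, byte value) of the first invalid UTF-8 byte, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset where decoding failed.
        return err.start, data[err.start]
    return None
```

For example, `first_invalid_utf8(b"What I claim as my improvement is\x97\r\r")` reports the offset just past the word "is" together with the value 0x97, while a pure-ASCII input returns None. On a real file, the bytes could be read with `pathlib.Path("282.txt").read_bytes()`.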

Poster: aibek Date: Nov 8, 2012 1:41am
Forum: texts Subject: Re: problem viewing full text

I found out how you got the character. Either:

(i) your text editor is set to save files in ANSI aka Windows-1252 aka cp1252. In this encoding 0x97 is the value for an em-dash. (See the first link below.) Or,

(ii) you copied your em-dash from a file in the Windows-1252 encoding and pasted it into your text file.

(The effect of both is the same -- your file is effectively in Win-1252 encoding.)

The solution to your problem is simple: convert the file from Win-1252 encoding to UTF-8 encoding. There are tools to do it automatically in Unix (the best is `iconv'). Or do it online (see the second link below). Or, if you provide a collection of all the text files to me, I will convert them for you -- it is trivial on my machine.

So that the problem does not happen in future, set your text editor to use UTF-8 encoding. Win-1252 is deprecated. (See the third link below.) Or, if (ii) above is true, stop using your ‘master’ em-dash (which you copy to the files you edit).

http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit1252.txt
http://kanjidict.stc.cx/recode.php
https://en.wikipedia.org/wiki/Code_page#Windows_.28ANSI.29_code_pages
This post was modified by aibek on 2012-11-08 09:41:27
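
For anyone without `iconv', the Win-1252 to UTF-8 conversion described above can be sketched in Python using the standard `cp1252` codec. The function name `cp1252_to_utf8` is made up for illustration, and the sketch assumes the whole file really is Windows-1252; a file that mixes encodings would need more care.

```python
def cp1252_to_utf8(data: bytes) -> bytes:
    """Re-encode Windows-1252 bytes as UTF-8."""
    # Byte 0x97 decodes to U+2014 (em-dash) under cp1252.
    # The few byte values cp1252 leaves undefined (such as 0x81)
    # raise UnicodeDecodeError rather than being guessed at.
    text = data.decode("cp1252")
    return text.encode("utf-8")
```

Applied to the bytes of a file such as 282.txt (read with `pathlib.Path.read_bytes` and written back with `write_bytes`), this turns the lone 0x97 into the three-byte UTF-8 em-dash and leaves plain ASCII untouched. Line endings are not changed by this step.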
