Migrating to JStore

First copy the content from emerson over to kinscoe.harcourt.com.

On emerson:

    # cd /content/http/tpc/psychcorp/www
    # tar cvf /tmp/tpcYYYYDDMMHHMM.tar .
    # chmod 755 /tmp/tpcYYYYDDMMHHMM.tar

Back on kinscoe:

    # scp -l kinscoe@emerson:/tmp/tpcYYYYDDMMHHMM.tar /usr/local/apache1_3/htdocs/.

Back on emerson:

    # rm /tmp/tpcYYYYDDMMHHMM.tar

Now on kinscoe:

    # cd /usr/local/apache1_3/htdocs
    # rm -Rf new
    # mkdir new
    # rm -Rf tpc
    # mkdir tpc
    # cd new
    # tar xvf ../tpcYYYYDDMMHHMM.tar
    # cd ../tpc
    # tar xvf ../tpcYYYYDDMMHHMM.tar

Now run the "Buy Now" conversion program. First make sure the following directories are set in the Perl code:

    # directory of existing content on web server
    $dir = "/usr/local/apache1_3/htdocs/tpc";
    # mirror directory of web content for modified HTML files
    $newdir = "/usr/local/apache1_3/htdocs/new";

Now run the program:

    # /usr/local/admin/fixem.pl
    FixEm.pl: vers 1.3
    Searching /usr/local/apache1_3/htdocs/tpc ...
    Thu Aug 15 11:09:41 2002
    Search complete.
    722 matching files (2785 matching lines) found out of 865 total HTML files.
    Stats:
    Splits: 0 Usplits: 0 Uisplits: 0

The program produces two output files:

    # import file for jstore
    $jout="/tmp/jstore.txt";
    # output report from this run
    $out="/tmp/fixem.txt";

The jstore.txt file goes to Jonathan Damron for import into the SQL Server database.

The fixem.txt file is analysed for badly formatted "buy now" links that will need to be fixed by hand. Look for things that are "missing" in the jstore.txt file and correlate them with the run log (fixem.txt). On a typical run there are 16 missing "kick-outs". Example:

    1744 tpc missing missing missing missing
    1816 tpc missing missing missing missing
    2739 tpc 0158689739WP299 RIAP4 Plus CD-ROM Includes unlimited-use software and software Manual. $654.50 http://www.psychcorp.com/catalogs/paipc/psy104bpri.htm
    2740 tpc missing missing missing missing
    2741 tpc 0158689755WP299 RIAP4 TO RIAP 4 PLUS CD ROM UPGRADE $87.00 http://www.psychcorp.com/catalogs/paipc/psy104bpri.htm
    2742 tpc missing missing missing missing
    3291 tpc missing missing missing U
    3292 tpc missing missing missing missing
    3293 tpc missing missing missing n
    3294 tpc missing missing missing missing
    3518 tpc 0158048202WH299 Conditional Reasoning Test of Aggression (Complete Kit - 25 Applicants - Includes Manual, 25 Test Booklets, Answer Sheet, Scoring Key and Score Interpretation.) 165.00 http://www.psychcorp.com/catalogs/hra/hra018apsy.htm
    3519 tpc missing missing missing missing
    3520 tpc 0158048210WH299 Conditional Reasoning Test of Aggression (Manual) 50.00 http://www.psychcorp.com/catalogs/hra/hra018apsy.htm
    3521 tpc missing missing missing missing
    3522 tpc 0158048229WH299 Conditional Reasoning Test of Aggression (Test Booklets - Pkg. of 25 Test Booklets, Answer Sheet, Scoring Key and Score Interpretation.) 125.00 http://www.psychcorp.com/catalogs/hra/hra018apsy.htm

For the ones that are "missing" in all fields, the best thing that can be done is to review the run log (fixem.txt) for that id number (example: 1744).

A good example of this is at
http://kinscoe.harcourt.com/new/catalogs/sla/slaf044atpc.htm

The original HTML is at
http://www.psychcorp.com/catalogs/sla/slaf044atpc.htm
or on emerson in
/content/http/tpc/psychcorp/www/catalogs/sla/slaf044atpc.htm

The section from the fixem.txt log file starts at the line:

/usr/local/apache1_3/htdocs/tpc/catalogs/sla/slaf044atpc.htm
SSI->

Workspace -->
Len--> 0 Left--> 11 Right--> -1 TC--> -46
ISBN --> missing
Len--> 0 Left--> 7 Right--> -1 TC--> -46
Title --> missing
Len--> 0 Left--> 6 Right--> -1 TC--> -46
Price --> missing
Len--> 122 Left--> 57 Right--> 110 TC--> 53
Link -->
LeftStr -->

RightStr -->

NewStr -->

Len--> 140 Left--> -1 Right--> 2 TC--> 3
RightLink -->
RLeftStr -->

RRightStr --> >

RNewStr -->

>

FINAL LINE:

>
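
Before reading individual log sections like the one above, it can help to list the id of every record that is "missing" in all four fields, so each one can be looked up in fixem.txt. The following is only a sketch: it assumes jstore.txt uses whitespace-separated fields in the order seen in the example records (id, site, ISBN, title, price, link), and the function name list_all_missing is made up for illustration.

```shell
# List the ids of records whose ISBN, title, price and link are all "missing".
# Assumption: fields are whitespace-separated, in the order
# id, site, ISBN, title, price, link, as in the example records.
list_all_missing() {
    awk 'NF == 6 && $3 == "missing" && $4 == "missing" &&
         $5 == "missing" && $6 == "missing" { print $1 }' "$1"
}

# Hypothetical usage against the import file from a run:
# list_all_missing /tmp/jstore.txt
```

Records such as 3291, where the last field holds a stray character instead of "missing", are deliberately not listed and still need to be checked by eye.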

The problem in the original HTML is that it was edited using a Windows/DOS-based editor and transferred via FTP in binary mode, so the DOS line endings were never converted to Unix newlines. Here is a sample from emerson:/content/http/tpc/psychcorp/www/catalogs/sla/slaf044atpc.htm

^M

^M
^M
^M

Notice the ^M at the end of each line. As a result the fixem Perl script cannot determine the line boundaries in this file and everything appears to be "missing". To find records for these in the run log (fixem.txt), search for "ISBN --> missing" or "Title --> missing".

The resulting testing sites are:

Before (should be a mirror of production - http://www.psychcorp.com):
http://kinscoe.harcourt.com/tpc/

and after:
http://kinscoe.harcourt.com/new/

To publish this modified content to the production site:

    # cd /usr/local/apache1_3/htdocs/new
    # tar cvf /tmp/newtpc.tar .
    # chmod 755 /tmp/newtpc.tar
    # scp /tmp/newtpc.tar kinscoe@wsweb01:/tmp/.

On wsweb01:

    # cd /www/hosts/www.psychcorp.com/htdocs
    # tar xvf /tmp/newtpc.tar
    # find . -exec chown tpc:tpc {} \;
    # find . -exec chmod 755 {} \;

Test it: http://www.psychcorp.com or http://www-p.psychcorp.com

Last updated by K. Inscoe 8/15/2002
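
For the ^M problem described earlier, one option is to find the affected HTML files before running the conversion and strip the carriage returns from them. This is a minimal sketch, not part of the original procedure. It assumes the files use DOS CR+LF line endings (as the ^M at the end of each line suggests), and the function names find_cr_files and strip_cr are made up for illustration.

```shell
# Print the names of HTML files under directory $1 that contain
# at least one carriage return (shown as ^M in many editors).
find_cr_files() {
    cr=$(printf '\r')
    find "$1" -type f -name '*.htm*' -exec grep -l "$cr" {} +
}

# Write a copy of file $1 to file $2 with every carriage return removed.
# Assumption: the input uses CR+LF line endings, so dropping the CR
# leaves ordinary Unix newlines.
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}

# Hypothetical usage on the mirror of the existing content:
# find_cr_files /usr/local/apache1_3/htdocs/tpc
```

If a file turned out to use bare carriage returns with no line feeds at all, removing them would join every line together; in that case `tr '\r' '\n'` would be the safer conversion.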