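You could use pandoc to convert all the HTML files to plain text (pass -t plain explicitly, since pandoc otherwise treats a .txt output file as Markdown):
> for f in *.html; do pandoc "$f" -s -t plain -o "${f%.html}.txt"; done
Then cat all the files and pipe them to wc -w to get a word count (plain wc also prints line and byte counts):
> cat *.txt | wc -w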
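If you would rather not leave .txt files lying around, the two steps above can be folded into one pipeline. This is only a rough sketch: the function name html_word_count is made up for illustration, and it assumes pandoc is installed and that its plain-text rendering is close enough to what you want counted.

```shell
# Print the total word count of all .html files in a directory
# (default: current directory) without writing intermediate .txt files.
# html_word_count is a hypothetical helper name, not part of pandoc.
html_word_count() {
    for f in "${1:-.}"/*.html; do
        # Skip the unexpanded pattern when no .html files match.
        [ -e "$f" ] || continue
        # Convert one HTML file to plain text on stdout.
        pandoc -f html -t plain "$f"
    done | wc -w
}
```

Call it as html_word_count for the current directory, or html_word_count path/to/site for another one. Because nothing is written to disk, unrelated .txt files in the same directory do not get counted by accident.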