Compass note

Trying out Sumy, a Python library for summarizing English text

I needed to summarize articles from English-language news sites in Python, so I used the Python library Sumy to generate the summaries.

  • Sumy features
    • Runs with just two parameters: a URL and the number of lines in the summary

Environment

bash-3.2$ uname -a
Darwin MacBook-Pro.local 20.3.0 Darwin Kernel Version 20.3.0: Thu Jan 21 00:07:06 PST 2021; root:xnu-7195.81.3~1/RELEASE_X86_64 x86_64
bash-3.2$ python -V
Python 3.7.7
bash-3.2$ pip -V
pip 20.0.2 from /usr/local/lib/python3.7/site-packages/pip (python 3.7)

Installing Sumy, the English text summarizer

bash-3.2$ pip install sumy
bash-3.2$ pip install git+https://github.com/miso-belica/sumy.git

Installing the NLTK natural language processing library

NLTK (Natural Language Toolkit) is a natural language processing library implemented in Python.

Sumy needs it to run. The command below downloads NLTK's punkt tokenizer data.

bash-3.2$ python -c "import nltk; nltk.download('punkt')"

[nltk_data] Downloading package punkt to /Users/hiroaki/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.

Everything is now ready.
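To double-check the setup from Python, a small helper like the one below can report which packages are still missing. This is only a sketch: `missing_modules` is a hypothetical name, and it checks that the modules can be found, not that the punkt data has been downloaded.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

If both packages are installed, `missing_modules(["sumy", "nltk"])` returns an empty list.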

Summarizing Yahoo!.com news with Sumy, a Python library for summarizing English text

This time I summarized an English news article from Yahoo!.com.

  • --length=10: summarize in 10 lines (one sentence per line)
  • --url="XXX": URL of the page to summarize
bash-3.2$ sumy lex-rank --length=10 --url="https://www.yahoo.com/entertainment/piers-morgan-meghan-markle-gmb-171747190.html"
A statement released from ITV read, "Following discussions with ITV, Piers Morgan has decided now is the time to leave Good Morning Britain.
Piers Morgan stormed off the set of Good Morning Britain following a heated discussion with co-host Alex Beresford about Meghan Markle.
Beresford added: "I understand that you don’t like Meghan Markle.
And I understand that you’ve got a personal relationship with Meghan Markle or had one and she cut you off.
I don’t think she has, but yet you continue to trash her."
Beresford then told Susanna Reid and viewers, "This is absolutely diabolical behavior.
Morgan returned to the studio to continue the discussion about racism with Beresford, Reid and Dr. Hilary Jones.
“I’m not trying to come on this show and take you down, tear you apart.
But when she began dating Prince Harry, she reportedly cut the former tabloid editor out of her social contacts.
Morgan has been continually outspoken in his condemnation of Markle and Prince Harry's decision to leave the royal family.
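The same kind of summary can also be produced from a Python script instead of the command line. The sketch below follows the usage pattern shown in Sumy's README, with `LexRankSummarizer` swapped in to match `sumy lex-rank`. The function names `summarize_url` and `numbered` are made up for this example, and the result is not guaranteed to match the CLI output exactly.

```python
def summarize_url(url, sentence_count, language="english"):
    """Return the top `sentence_count` sentences of the page at `url` as strings."""
    # Imported inside the function so that `numbered` below stays usable
    # even on a machine where Sumy is not installed.
    from sumy.nlp.stemmers import Stemmer
    from sumy.nlp.tokenizers import Tokenizer
    from sumy.parsers.html import HtmlParser
    from sumy.summarizers.lex_rank import LexRankSummarizer
    from sumy.utils import get_stop_words

    # Download the page and split it into sentences (needs NLTK's punkt data).
    parser = HtmlParser.from_url(url, Tokenizer(language))
    summarizer = LexRankSummarizer(Stemmer(language))
    summarizer.stop_words = get_stop_words(language)
    return [str(sentence) for sentence in summarizer(parser.document, sentence_count)]


def numbered(lines):
    """Format lines as a numbered list: '1. ...', '2. ...'."""
    return [f"{i}. {line}" for i, line in enumerate(lines, start=1)]
```

Calling `numbered(summarize_url(url, 10))` with the article URL used above should give a numbered 10-sentence summary. It needs network access and the punkt data downloaded earlier.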

The article was summarized into the following 10 lines.


  1. A statement released from ITV read, "Following discussions with ITV, Piers Morgan has decided now is the time to leave Good Morning Britain.
  2. Piers Morgan stormed off the set of Good Morning Britain following a heated discussion with co-host Alex Beresford about Meghan Markle.
  3. Beresford added: "I understand that you don’t like Meghan Markle.
  4. And I understand that you’ve got a personal relationship with Meghan Markle or had one and she cut you off.
  5. I don’t think she has, but yet you continue to trash her."
  6. Beresford then told Susanna Reid and viewers, "This is absolutely diabolical behavior.
  7. Morgan returned to the studio to continue the discussion about racism with Beresford, Reid and Dr. Hilary Jones.
  8. “I’m not trying to come on this show and take you down, tear you apart.
  9. But when she began dating Prince Harry, she reportedly cut the former tabloid editor out of her social contacts.
  10. Morgan has been continually outspoken in his condemnation of Markle and Prince Harry's decision to leave the royal family.

Here is the summary converted to Japanese with Google Translate.

  1. ITVから発表された声明は、次のように述べています。
  2. ピアーズ・モーガンは、メーガン・マークルについて共同ホストのアレックス・ベレスフォードと熱く議論した後、グッドモーニング・ブリテンのセットを襲撃しました。
  3. ベレスフォードは次のように付け加えました。「あなたがメーガン・マークルを好きではないことを理解しています。
  4. そして、あなたがメーガンマークルと個人的な関係を持っているか、またはそれを持っていて、彼女があなたを断ち切ったことを理解しています。
  5. 彼女が持っているとは思わないが、それでもあなたは彼女をゴミ箱に捨て続けている。」
  6. その後、ベレスフォードはスザンナ・リードと視聴者に、「これは絶対に悪魔的な行動です。
  7. モーガンはスタジオに戻り、ベレスフォード、リード、ヒラリー・ジョーンズ博士と人種差別についての話し合いを続けました。
  8. 「私はこのショーに来てあなたを倒そうとはしていません。あなたをバラバラにします。
  9. しかし、彼女がハリー王子と付き合い始めたとき、彼女は元タブロイド編集者を彼女の社会的接触から切り離したと伝えられています。
  10. モーガンは、マークルとハリー王子が王室を去るという決定を非難したことで、絶えず率直に発言してきました。

Next, let's summarize it in 3 lines.

bash-3.2$ sumy lex-rank --length=3 --url="https://www.yahoo.com/entertainment/piers-morgan-meghan-markle-gmb-171747190.html"
A statement released from ITV read, "Following discussions with ITV, Piers Morgan has decided now is the time to leave Good Morning Britain.
And I understand that you’ve got a personal relationship with Meghan Markle or had one and she cut you off.
Morgan has been continually outspoken in his condemnation of Markle and Prince Harry's decision to leave the royal family.

  1. A statement released from ITV read, "Following discussions with ITV, Piers Morgan has decided now is the time to leave Good Morning Britain.
  2. And I understand that you’ve got a personal relationship with Meghan Markle or had one and she cut you off.
  3. Morgan has been continually outspoken in his condemnation of Markle and Prince Harry's decision to leave the royal family.

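Comparing the 10-line and 3-line results, asking for fewer lines does not make the summary more condensed: the 3 lines were simply picked out of the 10. That is expected, because LexRank is an extractive method that selects existing sentences rather than rewriting them.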


  1. ITVから発表された声明は、次のように述べています。
  2. そして、あなたがメーガンマークルと個人的な関係を持っているか、またはそれを持っていて、彼女があなたを断ち切ったことを理解しています。
  3. モーガンは、マークルとハリー王子が王室を去るという決定を非難したことで、絶えず率直に発言してきました。
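Why the 3-line result is a subset of the 10-line result can be illustrated with a toy extractive summarizer. The sketch below is not Sumy's implementation. It is a simplified LexRank-style ranking that uses plain word-count cosine similarity and a PageRank-like power iteration, and all function names and parameter values are assumptions for illustration. Because each sentence gets one fixed score and the top N are taken, a smaller N always yields a subset of a larger N.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score each sentence by power iteration over a similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    bags = [Counter(tokenize(s)) for s in sentences]
    sim = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    row_sums = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Each sentence receives score from the sentences similar to it.
        scores = [
            (1 - damping) / n
            + damping * sum(sim[j][i] / row_sums[j] * scores[j]
                            for j in range(n) if row_sums[j])
            for i in range(n)
        ]
    return scores


def summarize(sentences, count):
    """Return the `count` best-scored sentences, kept in document order."""
    scores = rank_sentences(sentences)
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:count]
    return [sentences[i] for i in sorted(best)]
```

With any list of sentences, `summarize(sentences, 3)` contains only sentences that also appear in `summarize(sentences, 10)`, which mirrors the behavior observed above.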

The original article is below.

It is quite long, but Sumy summarized it in no time.


Piers Morgan leaving 'Good Morning Britain' after storming off set

Update: Hours after leaving the set of Good Morning Britain, Piers Morgan is leaving his position. A statement released from ITV read, "Following discussions with ITV, Piers Morgan has decided now is the time to leave Good Morning Britain. ITV has accepted this decision and has nothing further to add." This comes after Ofcom, the U.K.'s media regulator, launched an investigation into Morgan. More than 41,000 people wrote in to complain Morgan’s Meghan Markle comments.

Piers Morgan stormed off the set of Good Morning Britain following a heated discussion with co-host Alex Beresford about Meghan Markle.

The pair were discussing the Duke and Duchess of Sussex's interview with Oprah Winfrey, which aired in the U.S. on Sunday night and in the U.K. on Monday night, when former weather presenter Beresford accused Morgan of having a vendetta against Markle because she had ended their friendship after becoming engaged to Prince Harry.

“I hear Piers say William has gone through the same thing, but do you know what? Siblings experience tragedy in their life and one will be absolutely fine and brush it off and the other will not be able to deal with it so strongly and that is clearly what has happened with Prince Harry in this situation,” said Beresford said. “He walked behind his mother’s coffin at a tender, tender age in front of the globe. That is going to shape a young boy for the rest of his life,” continued Beresford. “So I think that we all need to take a step back.”

Beresford added: "I understand that you don’t like Meghan Markle. You’ve made that so clear a number of times on this program. A number of times. And I understand that you’ve got a personal relationship with Meghan Markle or had one and she cut you off. She’s entitled to cut you off if she wants to. Has she said anything about you since she cut you off? I don’t think she has, but yet you continue to trash her."

Morgan then got up from his seat and began to walk out saying, "OK. I’m done with this. Sorry. No. Sorry. You can trash me mate, but not on my own show."

He muttered something to Beresford, indecipherable to viewers, as he passed him.

Beresford then told Susanna Reid and viewers, "This is absolutely diabolical behavior. Piers spouts off on a regular basis. He has the ability to come here and talk from a position where he doesn't fully understand."

The show then went to a break, with Reid saying, "I think we should just pause there."

Morgan returned to the studio to continue the discussion about racism with Beresford, Reid and Dr. Hilary Jones. At the end of the segment, Morgan made a joke about healing family rifts in relation to himself and Beresford.

“I’m not trying to come on this show and take you down, tear you apart. Just because we’re on the same side, doesn’t mean we have to have the same view,” Beresford responded. “This whole situation is very personal for me and I’m by no way, shape or form accusing you of being a racist,” Beresford added. “I have the luxury of knowing you on and off-screen and we’ve had conversations, I know where you stand on this and I have a great amount of respect for you, Piers.”

He continued, “I’m tired of finding a different way to explain not to you, but to so many people on why what has been said is so wrong. I’ve walked into institutions as the only person of color and experienced covert and overt racism on so many occasions and why the Meghan interview really resonates with me is because an ex-work colleague — not on this show — asked me if I was worried about the shade of cocoa that my son was going to come out. So I fully understand the hurt that is behind all of that.”

Morgan has now tweeted in defense of his actions: "I just prefer not to sit there listening to colleagues call me diabolical."

Morgan has previously spoken about how he was friends with Markle when she first came to the UK as an actor. But when she began dating Prince Harry, she reportedly cut the former tabloid editor out of her social contacts.

Morgan has been continually outspoken in his condemnation of Markle and Prince Harry's decision to leave the royal family. He has been dismissive of Meghan's claims she received racist treatment by the British media and also mocked her claims in her interview with Oprah that she had suicidal thoughts while pregnant with Archie.